Seasonal forest fire risk and key drivers in Yunnan Province: a machine learning approach


Annual and seasonal variations in forest fires in Yunnan Province

From 2001 to 2020, a total of 9,404 forest fires occurred in Yunnan Province, with an annual average of 470 incidents. The interannual variation in forest fire occurrences exhibited significant fluctuations. The year 2020 had the highest number of forest fires, accounting for 14.12% of the total, while 2018 had the fewest, comprising only 1.08%. The interannual variation in forest fire occurrences can be divided into two distinct phases: 1) 2001–2010: Increasing Phase. During this period, a total of 5,796 forest fires were recorded, with a peak in 2004, accounting for 13.54% of the fires in this phase. Notably, forest fires were particularly frequent between 2003 and 2005, with an average of 723 incidents per year, exceeding the overall annual average. 2) 2010–2020: Decreasing Phase. In this phase, a total of 3,608 forest fires occurred, characterized by high interannual variability. The peak year was 2014, contributing 22.03% of the total fires during this period. Between 2012 and 2014, forest fires occurred frequently, with an annual average of 734 incidents, again exceeding the overall average.

The seasonal variation in forest fire occurrences from 2001 to 2020 was also prominent. Forest fires were most frequent in the spring (March, April, and May) and winter (December, January, and February). In spring, a total of 6,661 forest fires were recorded, accounting for 70.83% of the total, significantly higher than in other seasons. Winter followed with 2,556 fires, representing 27.17% of the total. In contrast, forest fires were rare in summer and autumn, with these two seasons combined accounting for only 1.98% of the total (Fig. 2).
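The seasonal shares can be recomputed from the counts reported above. The following is a minimal sketch that uses only those counts; the variable and function names are illustrative, and values recomputed this way may differ from the reported percentages in the last decimal place because of rounding.

```python
# Counts reported in the text for 2001-2020.
total_fires = 9404
spring_fires = 6661
winter_fires = 2556

# Summer and autumn are reported only as a combined share,
# so their combined count is derived as the remainder.
summer_autumn_fires = total_fires - spring_fires - winter_fires


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


seasonal_counts = [
    ("spring", spring_fires),
    ("winter", winter_fires),
    ("summer + autumn", summer_autumn_fires),
]

for label, count in seasonal_counts:
    print(f"{label}: {count} fires, {share(count, total_fires):.2f}% of total")
```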

Fig. 2: Statistics of forest fires in Yunnan Province from 2001 to 2020.

Note: This image was created using Origin 2021.

Model performance comparison

We calculated the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve and the prediction accuracy of each model using each subsample and the complete dataset for annual and seasonal data (Table 1). AUC values, prediction accuracy, and ROC curves were utilized to evaluate and compare the predictive performance of the six models for both annual and seasonal datasets. The AUC values of all six models exceeded 0.72, and the prediction accuracy was higher than 66%, indicating that all models achieved good fitting performance for both annual and seasonal data (Fig. 3).
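To make the two evaluation metrics concrete, the sketch below computes AUC and prediction accuracy for a small, made-up set of labels and predicted probabilities. It is not the authors' code: the 0.5 classification threshold and the toy data are assumptions. The AUC is computed through its rank interpretation, the probability that a randomly chosen fire point receives a higher score than a randomly chosen non-fire point.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Share of samples classified correctly at the given threshold (assumed 0.5)."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predicted) if y == p)
    return correct / len(labels)


# Hypothetical data: 1 = fire point, 0 = random (non-fire) point.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

print(f"AUC = {roc_auc(labels, scores):.3f}")
print(f"accuracy = {accuracy(labels, scores):.3f}")
```

The pairwise loop is quadratic in the sample size, which is fine for illustration; for large datasets a sort-based implementation from an established library would be the usual choice.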

Fig. 3: AUC values based on a complete dataset of six models for the full year and different seasons.

a Full year. b Spring. c Summer. d Autumn. e Winter. Note: This image was created using R studio 4.4.0.

Table 1 Comparison of prediction accuracy and goodness of fit of six models

For the annual dataset, the RF, BRT, and XGBoost models had higher AUC values and prediction accuracy than the other three models: their AUC values exceeded 0.91 and their prediction accuracy exceeded 82%. The other three models (LR, GR, SVM) also performed well, with AUC values exceeding 0.75 and prediction accuracy above 70%. Among the three leading models, the BRT model showed the best overall predictive performance. Its ROC curve lies closer to the upper left corner of the plot than that of the RF model, which indicates better discrimination. Its prediction accuracy was 90.6%, compared with 87.1% for the RF model. Both models had AUC values above 0.94 and are therefore suitable for predicting forest fires in Yunnan Province, but the AUC value of the BRT model was the higher of the two. The BRT model is thus the more suitable choice for predicting and analyzing forest fires in Yunnan Province throughout the year.

The test results for the spring dataset showed that the RF, BRT, and XGBoost models had higher AUC values and prediction accuracy compared to the other three models. The AUC values of these three models were all above 0.92, and their prediction accuracy exceeded 85%, with the BRT model demonstrating the best predictive performance compared to the RF and XGBoost models. The other three models (LR, GR, SVM) achieved AUC values above 0.83 and prediction accuracy exceeding 75%. The test results for the summer dataset revealed that all six models achieved AUC values above 0.92 and prediction accuracy above 88%. The XGBoost model had the best performance, with an AUC value of 0.989 and a prediction accuracy of 95.1%, followed by the BRT and SVM models. The test results for the autumn dataset indicated that the RF, BRT, and XGBoost models had higher AUC values and prediction accuracy than the other three models. The AUC values of these three models were all above 0.93, and their prediction accuracy exceeded 87%. Overall, the XGBoost model demonstrated the best performance, with an AUC value of 0.996 and a prediction accuracy of 95.2%, outperforming the RF and BRT models. The other three models (LR, GR, SVM) had AUC values above 0.83 and prediction accuracy exceeding 81%. The test results for the winter dataset showed that the BRT model outperformed the other five models in terms of AUC value and prediction accuracy, with an AUC value of 0.992 and a prediction accuracy of 95.7%.

The results indicate that the BRT model is the optimal choice for the annual, spring, and winter datasets owing to its high prediction accuracy and superior goodness of fit. For the summer and autumn datasets, the XGBoost model is the best option because of its efficient processing capability and high prediction accuracy. While the other models performed well on certain metrics, the BRT and XGBoost models showed clear advantages in evaluation indicators such as AUC. This study therefore recommends the BRT model for predicting forest fires with the annual, spring, and winter datasets, and the XGBoost model for the summer and autumn datasets.

Importance ranking of driving factors

An analysis of the differences in the proportion of driving factors across the six models for annual and seasonal datasets revealed slight variations in the importance of these factors in influencing forest fire occurrences (Fig. 4). Overall, meteorological factors accounted for nearly half of the total importance, making them the key drivers of forest fire occurrences in the study area. All six models indicated that Minhum or Hum and Max_temp were the key factors influencing forest fire occurrences. Additionally, Temp_diff showed a high importance proportion in the GR, RF, BRT, and SVM models, further emphasizing its role as a significant driver of forest fire occurrences.
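The notion of an importance proportion, and of a factor group accounting for a share of the total importance, can be illustrated as follows. This is a sketch under stated assumptions: the raw importance scores are made up for illustration and are not values from this study, and the assignment of each variable to a group follows the descriptions in the text only loosely.

```python
# Hypothetical raw importance scores from one model (NOT values from the study).
raw_importance = {
    "Minhum": 4.0,
    "Max_temp": 3.0,
    "Temp_diff": 2.0,
    "Dem": 5.0,
    "Dis_rail": 4.0,
    "GDP": 2.0,
}

# Assumed grouping of variables into factor categories.
factor_group = {
    "Minhum": "meteorological",
    "Max_temp": "meteorological",
    "Temp_diff": "meteorological",
    "Dem": "topographic",
    "Dis_rail": "human",
    "GDP": "socioeconomic",
}


def to_proportions(scores):
    """Rescale raw importance scores so that they sum to 1."""
    total = sum(scores.values())
    return {name: value / total for name, value in scores.items()}


def group_totals(proportions, groups):
    """Sum importance proportions within each factor group."""
    totals = {}
    for name, proportion in proportions.items():
        group = groups[name]
        totals[group] = totals.get(group, 0.0) + proportion
    return totals


proportions = to_proportions(raw_importance)
totals = group_totals(proportions, factor_group)

for group, value in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{group}: {100 * value:.1f}% of total importance")
```

Rescaling to proportions is what makes importance values comparable across models whose raw importance measures are on different scales.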

Fig. 4: Selection of important driving factors for different seasons by each model used.

a LR. b GR. c RF. d BRT. e XGBoost. f SVM. Note: This image was created using Origin 2021.

The variations in the proportion of driving factors for the spring dataset showed that Minhum and Max_temp had high importance proportions across all models. Additionally, Temp_diff showed a higher importance proportion in the GR, RF, BRT, XGBoost, and SVM models, indicating that these three meteorological factors are the key drivers of forest fire occurrences in spring. Moreover, human factors such as Dis_rail or Dis_res were also found to significantly influence spring forest fire occurrences. For the summer and autumn datasets, all models consistently showed that Minhum, Max_temp, and Sun had the highest importance. Additionally, Prec showed high importance proportions in the LR, GR, RF, BRT, and XGBoost models, further highlighting its role as a key driver. Soil and vegetation factors, such as SM and NDVI, showed significantly higher importance in summer and autumn than in other seasons, reflecting their critical role in assessing forest fire occurrences during these periods. For the winter dataset, all six models indicated that Minhum and Max_temp had the highest importance proportions. Hum also showed a high importance proportion in the GR, RF, BRT, XGBoost, and SVM models, suggesting that relative humidity and maximum temperature are the key factors driving winter forest fires. Additionally, Mwin showed a higher importance proportion in the LR, RF, BRT, and XGBoost models in winter than in other seasons.

It is noteworthy that among the topographic factors, elevation consistently demonstrated a significant influence on forest fire occurrence across different seasons in the LR, GR, RF, BRT, and XGBoost models. Additionally, at least one of the socioeconomic indicators, such as population or GDP, was identified as an important factor in at least one instance across the different models.

Differences in important driving factors among seasons based on optimal prediction models

We further analyzed how the important driving factors of forest fire occurrence differ among seasons, based on the optimal model for each season. The importance of the driving factors differed significantly among seasons. Overall, meteorological factors accounted for nearly half or more of the total importance and were the key driving factors of forest fire occurrence in the study area.

In spring, Max_temp and Minhum or Hum are the key meteorological driving factors, followed by Temp_diff, reflecting the importance of these meteorological indicators for spring forest fires. In addition, the human factors Dis_rail and Dis_res show relatively high importance, reflecting the relationship between human activity and spring forest fires. In summer, Max_temp and Minhum remain the most critical driving factors. However, the importance of meteorological factors such as Sun and Prec, as well as of the soil and vegetation factors NDVI and SM, increases further compared with other seasons, reflecting the significant impact of these factors on summer forest fires. In autumn, SM becomes the most critical driving factor among the soil and vegetation factors, while Minhum, NDVI, and Prec continue to play an important role. In winter, meteorological factors such as Minhum and Max_temp remain key driving factors. The importance of human factors such as Dis_des and Pop increases significantly, and GDP is prominent among the socioeconomic factors. These factors reflect the relationship between human economic activities and winter forest fires. It is worth noting that Mwin is an important driving factor in winter, indicating that the monsoon also has a significant impact on the occurrence of forest fires.

Mapping forest fire probability

We used key driving factors selected from six models to predict the probability of forest fire occurrence at fire points and random points using forest fire prediction models. Spatial probability analysis of forest fires in Yunnan Province was conducted with the Inverse Distance Weighting (IDW) method in ArcGIS 10.8. Since the probability of forest fire occurrence varies across seasons, probability maps were generated for the entire year and for each of the four seasons within the study area.
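For readers unfamiliar with IDW, the sketch below shows the basic idea: the value at an unsampled location is a weighted average of the values at sampled points, with weights that decay with distance. It is a plain-Python illustration, not the ArcGIS implementation; the power parameter of 2, the coordinates, and the probabilities are assumptions made for the example, and options such as search radius or neighbor count are omitted.

```python
import math


def idw(points, query, power=2.0):
    """Inverse Distance Weighting estimate at `query`.

    points: list of (x, y, value) tuples, e.g. predicted fire probabilities
            at fire points and random points.
    query:  (x, y) location to interpolate.
    power:  distance exponent; 2 is a common choice (assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in points:
        distance = math.hypot(x - query[0], y - query[1])
        if distance == 0.0:
            # The query coincides with a sample point: return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical sample points: (x, y, predicted probability).
samples = [(0.0, 0.0, 0.2), (2.0, 0.0, 0.8), (0.0, 2.0, 0.5)]

print(idw(samples, (1.0, 0.0)))
print(idw(samples, (0.0, 0.0)))
```

Because the weights are positive and normalized, an IDW estimate always lies between the minimum and maximum of the sampled values, so interpolated probabilities stay within the range of the model predictions.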

The annual forest fire probability map (Fig. 5) shows that areas with a high probability of forest fire occurrence are primarily distributed in the northern and southeastern parts of Lijiang City, the eastern part of Dali Bai Autonomous Prefecture, the northwestern and southwestern parts of Pu’er City, and most of Xishuangbanna Dai Autonomous Prefecture. Additionally, the LR, GR, and SVM models identified high-probability regions in the southern part of Lincang City, while the RF, BRT, XGBoost, and SVM models also predicted high-probability areas in the southeastern part of Nujiang Lisu Autonomous Prefecture, the northern part of Honghe Hani and Yi Autonomous Prefecture, and the northeastern part of Wenshan Zhuang and Miao Autonomous Prefecture. The SVM model also shows a small number of high-probability areas in the southern part of Yuxi City.

Fig. 5: Probability map of forest fires in Yunnan province.

a LR. b GR. c RF. d BRT. e XGBoost. f SVM. Note: This image was created using ArcGIS 10.8.

The probability of forest fire occurrence in Yunnan Province shows significant seasonal differences (Fig. 5). In spring, all six models predict that areas with a high probability of forest fire occurrence are primarily distributed in the southeastern part of Lijiang City, the eastern part of Dali Bai Autonomous Prefecture, the southern part of Lincang City, the northwestern and southwestern parts of Pu’er City, the southern part of Yuxi City, and most of Xishuangbanna Dai Autonomous Prefecture. The RF, BRT, XGBoost, and SVM models also indicate high-probability regions in the northern part of Honghe Hani and Yi Autonomous Prefecture and the northeastern part of Wenshan Zhuang and Miao Autonomous Prefecture. In summer, all six models predict that high-probability areas are primarily distributed in the northern part of Dali Bai Autonomous Prefecture. The LR, GR, RF, BRT, and XGBoost models consistently indicate high-probability regions in the northwestern and southwestern parts of Pu’er City and the northern part of Xishuangbanna Dai Autonomous Prefecture, and the SVM model further identifies high-probability regions in the northern part of Zhaotong City.
In autumn, areas with a high probability of forest fire occurrence, as predicted by all six models, are concentrated in the southwestern part of Pu’er City, most of Xishuangbanna Dai Autonomous Prefecture, the southeastern part of Honghe Hani and Yi Autonomous Prefecture, and most of Wenshan Zhuang and Miao Autonomous Prefecture. Additionally, the LR, RF, BRT, XGBoost, and SVM models also showed a high probability of forest fires in the southeastern part of Honghe Hani and Yi Autonomous Prefecture and most areas of Wenshan Zhuang and Miao Autonomous Prefecture. In winter, areas with a high probability of forest fire occurrence, as predicted by all six models, are concentrated in the northern and southeastern parts of Lijiang City, the southeastern part of Nujiang Lisu Autonomous Prefecture, the western part of Dali Bai Autonomous Prefecture, the northwestern and southwestern parts of Pu’er City, most of Xishuangbanna Dai Autonomous Prefecture, and the northeastern part of Wenshan Zhuang and Miao Autonomous Prefecture. Additionally, the BRT and XGBoost models also identify high-probability regions in the northern part of Honghe Hani and Yi Autonomous Prefecture. In addition, some high fire risk areas lie along the China–Myanmar border in the southwest, the China–Laos border in the south, and the China–Vietnam border in the southeast of Yunnan Province, indicating that fires entering from across the border pose a threat to the study area.

Based on the optimal BRT model, the areas with a high probability of forest fires in spring and winter are mainly concentrated in the northwest, southwest, and southeast of Yunnan Province. In spring, the main high-probability areas are the northern and eastern parts of Lijiang City, the eastern part of Dali Bai Autonomous Prefecture, the northwestern and southwestern parts of Pu’er City, the southwestern and southeastern parts of Xishuangbanna Dai Autonomous Prefecture, the northern part of Honghe Hani and Yi Autonomous Prefecture, and the northeastern part of Wenshan Zhuang and Miao Autonomous Prefecture. In winter, the main high-probability areas are the southeastern part of Nujiang Lisu Autonomous Prefecture, the northern and eastern parts of Lijiang City, the northwestern and southern parts of Pu’er City, the southwestern and southeastern parts of Xishuangbanna Dai Autonomous Prefecture, the northern part of Honghe Hani and Yi Autonomous Prefecture, and the northeastern part of Wenshan Zhuang and Miao Autonomous Prefecture. Based on the optimal XGBoost model, the high-probability areas in summer are mainly concentrated in the northwest and southwest of Yunnan Province, including the northern part of Dali Bai Autonomous Prefecture, the northwestern and southwestern parts of Pu’er City, and the northern part of Xishuangbanna Dai Autonomous Prefecture. In autumn, they are mainly concentrated in the southwest and southeast, including the southwestern part of Pu’er City, most of Xishuangbanna Dai Autonomous Prefecture, the southeastern part of Honghe Hani and Yi Autonomous Prefecture, and most of Wenshan Zhuang and Miao Autonomous Prefecture. Although the high-probability areas in summer and autumn overlap with some of the high-incidence areas in spring and winter, their spatial pattern is relatively concentrated.

Marginal effects of forest fire drivers

Based on the fitting results of the BRT model, which had the highest prediction accuracy, this study clarifies the relationship between each important driving factor and the probability of forest fires through marginal effects (Fig. 6). The vertical axis represents the marginal effect: the larger the value, the higher the probability of forest fires. In the BRT model, as in the other five models, Dem is considered to have a significant impact on the occurrence of forest fires. As elevation increases from 0 to 500 m, the probability of forest fires shows a significant downward trend, and forest fires are more likely to occur in low-altitude areas. When Minhum and Hum exceed 20%, the probability of forest fires gradually decreases. The three meteorological indicators Max_temp, Temp_diff, and Sun show a similar pattern: below a certain threshold the probability of forest fires remains basically unchanged, but above that threshold it shows a significant upward trend. For example, when the daily maximum temperature is between 0 and 20 °C, the probability of forest fires fluctuates slightly, but when it exceeds 20 °C, the probability shows a significant increasing trend. When Temp_diff is between 0 and 6 °C, the probability of forest fires does not change much; between 6 and 10 °C it gradually decreases, and above 10 °C it gradually increases. When Sun is between 0 and 6 h, the probability of forest fires does not change much, but when it exceeds 6 h, the probability gradually increases. In terms of infrastructure, when Dis_rail is within 0–1 km, the probability of forest fires gradually decreases. In terms of socioeconomics, when GDP is within 0–50,000 yuan/km², the probability of forest fires gradually increases, but when it exceeds 50,000 yuan/km², the probability remains unchanged.
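One common way to obtain marginal effect curves of this kind from a fitted model is partial dependence: the factor of interest is fixed at each value on a grid for every sample, and the model predictions are averaged. The sketch below illustrates that procedure with a hypothetical stand-in predictor; it is an assumption that this matches the procedure used for Fig. 6, and `toy_predict` and the sample rows are invented solely to mimic the threshold-type response described above.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is set to each grid value for all rows."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original sample is untouched
            modified[feature] = value  # override only the factor of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_predict(row):
    """Hypothetical stand-in for a fitted model (NOT the study's BRT model).

    The response is flat up to 20 degrees C and rises above it, and it is
    higher when minimum humidity is at or below 20%.
    """
    temp_effect = max(0.0, row["Max_temp"] - 20.0) * 0.02
    humidity_effect = 0.3 if row["Minhum"] <= 20.0 else 0.1
    return min(1.0, 0.1 + temp_effect + humidity_effect)


# Hypothetical samples with two driving factors.
rows = [
    {"Max_temp": 15.0, "Minhum": 10.0},
    {"Max_temp": 25.0, "Minhum": 40.0},
]

temperature_grid = [10.0, 20.0, 30.0]
curve = partial_dependence(toy_predict, rows, "Max_temp", temperature_grid)

for temperature, effect in zip(temperature_grid, curve):
    print(f"Max_temp = {temperature:.0f}: mean predicted probability = {effect:.2f}")
```

Averaging over all samples is what separates the effect of one factor from the others: the remaining factors keep their observed values, so the curve reflects how the prediction changes with the chosen factor alone.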

Fig. 6: Marginal effects based on the full sample results of the BRT model.

Note: This image was created using R studio 4.4.0.


