Study area
Krasnoyarsk Krai is a vast and ecologically diverse region located in Siberia, Russia. It spans over 2.3 million square kilometers, making it one of the largest administrative divisions in the country. The region’s landscape is characterized by a mix of taiga forests, mountain ranges, river valleys, and vast plains. The Krai is home to a variety of ecosystems, from boreal forests to tundra, depending on the geographical area. Forests are a prominent feature of Krasnoyarsk Krai: coniferous forests (mainly pine, spruce, and fir) dominate the central and northern parts, while mixed forests, including deciduous trees such as birch and aspen, are found in the southern regions. These forests regulate the local climate, maintain biodiversity, and support the livelihoods of local communities. Figure 1 shows the aftermath of a wildfire, highlighting the impact of such fires on the region’s coniferous forests.

Burned forest in the northern taiga near Lake Lama on the Putorana Plateau, Krasnoyarsk Krai.
The forests of Krasnoyarsk Krai are a key part of the Russian forest zone, one of the largest forested areas in the world. These ecosystems are rich in biodiversity and are vital for carbon sequestration, acting as natural climate regulators. However, they also face threats from wildfires, especially in the summer months, when dry conditions, heatwaves, and strong winds create an ideal environment for fire spread. The region’s fire-prone forests, coupled with human activities and climatic factors, contribute to the high frequency and intensity of wildfires.
Figure 2 shows the annual distribution of wildfires from 2010 to 2024. In years with higher wildfire counts, such as 2012, 2019, 2018, and 2017, extended dry periods, high summer temperatures, and low precipitation contributed to drier vegetation, making forests more susceptible to ignition. Additionally, increased lightning activity during thunderstorms and higher levels of human-caused ignition events could have further elevated fire numbers in these years.
Conversely, lower wildfire counts in years like 2024, 2021, and 2023 may indicate periods with more favorable moisture conditions. Increased precipitation, lower temperatures, or shorter fire seasons would result in higher vegetation moisture and reduced fuel availability, thereby decreasing the likelihood of fires 18. Effective fire management and prevention strategies implemented during these periods might also have contributed to the reduction in fire occurrences 19.

Annual wildfire occurrences in Krasnoyarsk Krai.
Figure 3 shows the distribution of wildfire occurrences based on the distance to the nearest town, categorized by detection method. The histogram represents the number of fires detected within 150 km of a town, with colors indicating different detection sources. Data smoothing was applied to remove outliers. The most common detection methods include satellite observations, ground patrols, and aerial surveillance.

Distribution of wildfire occurrences by distance to the nearest town, categorized by detection method.
Fire size statistics

Distribution of wildfire areas in Krasnoyarsk Krai across six fire size classes (A–F), stratified by fire type. Each panel represents a distinct size category ranging from very small fires (Class A: < 0.2 ha) to extreme fires (Class F: >2000 ha), with fire types including surface fires (weak, medium, strong), underground fires, and crown fires. The histograms reveal shifting fire-type compositions and frequency distributions as fire size increases, from predominantly weak surface fires in smaller classes to a higher prevalence of crown and strong surface fires in larger classes.
Wildfire classification by burn area helps assess severity and response needs. Class A (\(\le\)0.2 ha) includes small ignitions that are easily controlled. Class B (0.2–2 ha) covers minor fires with limited impact. Class C (2–20 ha) consists of moderate fires requiring firefighting efforts. Class D (20–200 ha) represents large fires that are difficult to contain. Class E (200–2000 ha) includes major wildfires causing damage. Class F (>2000 ha) refers to catastrophic fires 20 with widespread destruction. Higher classes demand more resources and pose greater environmental and societal risks.
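The size classes above can be expressed as a simple lookup. The following is a minimal sketch, not the authors' implementation: the function name `fire_size_class` is hypothetical, and treating each upper bound as inclusive is an assumption (the text states this explicitly only for Class A).

```python
def fire_size_class(area_ha: float) -> str:
    """Return the size class (A-F) for a burned area given in hectares."""
    if area_ha < 0:
        raise ValueError("burned area must be non-negative")
    # Upper bounds of classes A-E in hectares (assumed inclusive);
    # anything larger than 2000 ha falls into class F.
    upper_bounds = [(0.2, "A"), (2.0, "B"), (20.0, "C"), (200.0, "D"), (2000.0, "E")]
    for bound, label in upper_bounds:
        if area_ha <= bound:
            return label
    return "F"


if __name__ == "__main__":
    for area in (0.1, 1.5, 15.0, 150.0, 1500.0, 164144.0):
        print(f"{area:>10.1f} ha -> class {fire_size_class(area)}")
```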
The distribution of wildfire sizes (Fig. 4) in Krasnoyarsk Krai from 2010 to 2024 reveals distinct patterns influenced by climate, vegetation, and fire behavior dynamics. The majority of fires fall into the small-scale categories (Classes A and B, 0–2 ha), dominated by weak-to-medium surface fires. These are often caused by localized ignition sources such as human activity (discarded cigarettes, agricultural burns, equipment sparks) or natural events like lightning strikes in dry conditions 21. Due to their limited fuel load and quick suppression efforts, they rarely escalate into larger fires. Some underground fires appear in Class B, in peatland areas, where smoldering fires can persist for long periods despite their small initial size.
As fire size increases, the intensity and type of fire change significantly. Moderate and large fires (Class C, 2–20 ha, and Class D, 20–200 ha) show a notable rise in strong surface and crown fires, suggesting that fires of this scale often occur in dense boreal forests where dry needles, fallen branches, and undergrowth provide extensive fuel. Crown fires, which involve the burning of tree canopies, become more frequent in these classes, indicating more extreme fire behavior driven by prolonged droughts, strong winds, and high temperatures 22. These fires spread rapidly, often affecting large areas before containment efforts can take place.
The largest fires, categorized as Class E (200–2000 ha) and Class F (>2000 ha, up to 164,144 ha), represent the most destructive wildfire events, primarily consisting of high-intensity surface and crown fires. These fires are relatively rare, with fewer than 100 occurrences over the 2010–2024 study period, yet they contribute disproportionately to total burned area. The driving factors behind these massive fires include prolonged summer droughts, strong winds that facilitate rapid fire spread, and the vast, continuous stretches of forest typical of central and eastern Krasnoyarsk Krai 23. Unlike smaller fires, which may burn for only a few hours or days, Class E and F fires can last for weeks or even months, often requiring large-scale firefighting operations.

Spatial distribution of wildfires in central part of Krasnoyarsk Krai during May–August 2023 and 2024, with fires classified by size.
A key factor influencing the escalation of fires into these larger categories is fuel accumulation 24. In remote forested areas with limited fire suppression, decades of deadwood buildup, thick undergrowth, and dry organic material create conditions where a single ignition source can lead to an uncontrollable wildfire. Climate change exacerbates this risk by increasing the frequency of heatwaves and reducing precipitation, making forests drier for longer periods. Additionally, strong wind patterns play a crucial role in the spread of crown fires, allowing embers to ignite new fires kilometers away from the original source.
The fire size distribution highlights the growing threat of megafires 25 (Classes E and F) driven by climate extremes, while small-to-moderate fires (Classes A to D) remain frequent due to human activity and localized weather conditions. The increasing frequency of larger, more intense fires suggests that fire management strategies must adapt to account for longer fire seasons, reduced water availability, and the rising impact of climate change on Siberian forests.
Figure 5 shows the spatial distribution of wildfires in Krasnoyarsk Krai during 2023–2024, with clear clustering patterns that align with climatic gradients, dominant vegetation zones, and proximity to anthropogenic infrastructure. The highest concentration of fires is observed between \(56^\circ\)N and \(60^\circ\)N, corresponding to regions characterized by dense boreal forests (taiga) 26. These areas are dominated by coniferous species such as Siberian larch (Larix sibirica) 27, Scots pine (Pinus sylvestris) 28, and Siberian spruce (Picea obovata) 29, which are known for high flammability due to resinous litter and canopy structure. These forests carry a high fuel load from the accumulation of dry needles, fallen branches, and peat-rich soils, making them highly flammable, especially in dry conditions. Additionally, fire density is elevated near transport corridors and logged areas, as seen through overlay analysis with road networks 30 and land use data (not shown), suggesting a combined effect of natural and anthropogenic factors. The central and eastern parts of Krasnoyarsk Krai are vulnerable to large-scale wildfires (Class E and F fires), often driven by lightning strikes, prolonged summer droughts, and strong winds that facilitate rapid fire spread 31,32.
In contrast, northern areas (above \(60^\circ\)N) experience fewer fires due to colder temperatures, higher soil moisture, and the presence of permafrost. This region is characterized by sparse larch-dominated forests mixed with tundra landscapes, where wet and frozen ground limits fire propagation. When fires do occur, they are typically small (Class A and B) and slow-moving, affecting surface vegetation rather than the deep forest canopy. However, increasing temperatures in recent years may be reducing permafrost stability, potentially leading to more frequent fires in these areas in the future.
In the southern regions (52–56\(^\circ\)N), the landscape transitions from dense taiga forests to mixed and deciduous forests, featuring birch (Betula pendula), aspen (Populus tremula), and pine stands interspersed with grasslands 33. These areas experience more frequent but smaller fires (Class A, B, and C), often linked to human activities such as agriculture, logging, and settlement expansion. Due to higher population density, ignition sources such as discarded cigarettes, uncontrolled burning, and equipment sparks contribute strongly to fire outbreaks. Unlike the north, where fires are primarily natural, southern fires are often anthropogenic.
Seasonally, fires follow a latitudinal shift, starting in May–June in the south, where early-season dryness increases flammability, before spreading northward by July–August, when peak summer heat reduces humidity and dries out vegetation. The largest fires occur in mid-to-late summer in the central and eastern boreal forests, where drought, high temperatures, and low precipitation create ideal fire conditions 34.
The mapping results highlight the connection between climate, vegetation type, and fire risk. The central taiga forests are most prone to large-scale fires due to their high biomass and susceptibility to drought, while northern tundra and permafrost regions remain largely fire-resistant but could become more vulnerable with climate change. Meanwhile, southern forests and grasslands experience more frequent human-caused fires, though they are generally smaller in size. Given the increasing trend of warmer summers and drier conditions, wildfire risk in Krasnoyarsk Krai is expected to rise, particularly in mid-latitude forested areas where fire suppression remains a challenge due to vast, inaccessible terrain 35.
Climatic factors
The analysis of climate conditions in the month prior to each wildfire in Krasnoyarsk Krai (2010–2024; Fig. 6) reveals trends and correlations between meteorological variables and fire occurrences. Data aggregated over the month preceding each of the 17,921 fires in the region highlight key environmental factors contributing to fire risk. The minimum temperature at 2 m varied between \(-15^\circ\)C and \(+20^\circ\)C, while the maximum temperature ranged from \(-10^\circ\)C to \(+30^\circ\)C. Fires were most frequent when minimum temperatures were around 5–15\(^\circ\)C and maximum temperatures reached 20–25\(^\circ\)C. Dew point temperatures ranged from \(-15^\circ\)C to \(15^\circ\)C, with peak fire occurrences corresponding to 5–10\(^\circ\)C 36. Specific humidity values showed that fires were more frequent when humidity was between 5 and 10 g/kg. Relative humidity varied from 50% to 100%, with the highest number of fires occurring at around 60–70%. Fires were most common when wind speeds at 10 m ranged from 2 to 5 m/s, indicating that moderate winds may facilitate fire spread 37. Surface pressure varied between 80 and 100 kPa, with a noticeable clustering of fires around 90–95 kPa. Fires were more frequent when surface shortwave downward flux was between 10 and 25 \(\text {W/m}^{2}\), suggesting a strong correlation between solar radiation and fire risk 38. Cloud cover ranged from 40% to 90%, with fire numbers peaking when cloud cover was around 50–60%. Corrected total precipitation in the month before fires was typically low, often below 5 mm, while snow depth was minimal, usually under 20 cm. This indicates that fires were more likely in drier periods 39. Soil moisture percentages ranged from 30% to 100%, with most fires occurring when soil moisture was below 50%, reinforcing the role of dry conditions in fire susceptibility 40. Higher temperatures, especially above \(20^\circ\)C, increase fuel dryness and enhance the likelihood of ignition 18.
The strong correlation between maximum temperatures of 20–25\(^\circ\)C and fire frequency aligns with the critical threshold at which vegetation loses moisture and becomes highly flammable 41. The fact that fires peak when relative humidity is between 60% and 70% suggests that extreme dryness is not always necessary for ignition 42. Moderate humidity levels may sustain flammability while preventing the full saturation of fuels. Specific humidity in the 5–10 g/kg range also aligns with conditions where fine fuels remain dry enough to burn despite some atmospheric moisture 43. Moderate winds (2–5 m/s) provide enough airflow to oxygenate fires and spread embers without extinguishing flames, which explains why fires are more frequent in these conditions. Higher wind speeds could lead to rapid fire spread, while very low wind speeds might suppress fire activity 44. Increased solar radiation (10–25 \(\text {W/m}^{2}\)) contributes to surface heating and drying of vegetation, which in turn elevates fire risk 45. Fires are more common under partly cloudy conditions (50–60%), likely because full cloud cover reduces solar heating, while completely clear conditions might allow too much moisture loss, making fuels less flammable. Snow depth under 20 cm suggests that fires predominantly occur in periods where snow has either melted or failed to accumulate significantly 46. Fires occur most frequently when soil moisture is below 50%, reinforcing the idea that drought stress on vegetation directly influences fire susceptibility 47. Drier soils also contribute to water stress in plants, making them more prone to burning 48. The data clearly illustrate that wildfires in Krasnoyarsk Krai are strongly influenced by a combination of high temperatures (above \(20^\circ\)C), low precipitation (<5 mm), moderate winds (2–5 m/s), and dry soils (<50% moisture).
These findings suggest that climate trends, such as warming temperatures and shifts in precipitation patterns, could exacerbate wildfire risks in the future. As temperatures continue to rise, the frequency and intensity of wildfires may increase, posing challenges for fire management and disaster prevention strategies. Understanding these climatic influences can help in developing targeted mitigation measures, such as improved early warning systems, forest management practices, and policies aimed at reducing fire-prone conditions. Future research should focus on long-term climate projections and their potential impacts on wildfire behavior to better prepare for and mitigate the risks associated with climate-driven fire activity in the region.

Histograms of averaged climatic conditions during the month preceding wildfire ignition events in Krasnoyarsk Krai (2010–2024). Each panel shows the distribution of a specific environmental variable (minimum/maximum temperature, dew point, specific and relative humidity, wind speed and direction, surface pressure, soil moisture, cloud cover, UV index, solar radiation, precipitation, snow depth, and surface temperature), averaged over the 30 days prior to each recorded fire. These distributions illustrate the typical pre-fire atmospheric and surface conditions.
Fire indices

Distribution of fire indices.
Analysis of fire indices (Fig. 7) for locations where wildfires were recorded, based on climatic data from the month prior to each fire, reveals distinct trends in fire behavior and risk factors. The indices examined include the Duff Moisture Code 49 (DMC), the Fire Weather Index 50 (FWI), and the Nesterov Index 51, each quantifying fire risk based on different meteorological and fuel moisture conditions.
DMC represents the moisture content of decomposing organic layers in the forest floor, with higher values indicating drier conditions. It is calculated as:
$$\begin{aligned} \text {DMC} = DMC_{\text {previous}} + \left( \frac{0.92D}{1 + 0.0015DMC_{\text {previous}}} \right) \end{aligned}$$
(1)
where:
- \(DMC_{\text {previous}}\) is the previous day’s DMC value,
- \(D\) is the daily temperature-dependent drying factor, which increases with temperature and decreases with precipitation.
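The daily update in Eq. (1) can be sketched as a short recurrence. This is an illustrative sketch of the equation exactly as written above, not the full Canadian Duff Moisture Code procedure; the function names `update_dmc` and `dmc_series` are hypothetical, and the drying factors in the example are made-up values.

```python
def update_dmc(dmc_previous: float, drying_factor: float) -> float:
    """One daily step of Eq. (1): DMC = DMC_prev + 0.92*D / (1 + 0.0015*DMC_prev)."""
    return dmc_previous + (0.92 * drying_factor) / (1.0 + 0.0015 * dmc_previous)


def dmc_series(dmc_start: float, drying_factors: list) -> list:
    """Apply the daily update over a sequence of drying factors D."""
    values = []
    dmc = dmc_start
    for d in drying_factors:
        dmc = update_dmc(dmc, d)
        values.append(dmc)
    return values


if __name__ == "__main__":
    # Hypothetical drying factors for five consecutive days.
    print(dmc_series(0.0, [0.5, 0.6, 0.4, 0.7, 0.8]))
```

Because the increment is divided by a term that grows with the previous value, the same drying factor adds slightly less to the index as the duff layer dries out.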
Most wildfires occurred in areas where DMC values ranged from 1.0 to 3.0, suggesting that fires were more likely when the duff layer was relatively dry but not completely desiccated. These conditions favored smoldering combustion and fire spread through surface fuels.
FWI measures fire intensity potential and is derived from three sub-indices: the Fine Fuel Moisture Code (FFMC), DMC, and the Drought Code (DC). The FWI itself is calculated as:
$$\begin{aligned} \text {FWI} = 2.72 \times \left( \frac{ISI}{1 + e^{0.05039(DC - 80)}} \right) \end{aligned}$$
(2)
where:
- ISI (Initial Spread Index) is a function of wind speed and fine fuel moisture,
- DC (Drought Code) reflects long-term drying effects,
- \(e\) is the base of the natural logarithm.
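Eq. (2) can be evaluated directly once ISI and DC are known. The sketch below implements the expression as it is written in this section, which is a simplified form and not the full Canadian FWI System; the function name `fwi_simplified` and the example inputs are hypothetical.

```python
import math


def fwi_simplified(isi: float, dc: float) -> float:
    """Evaluate Eq. (2): FWI = 2.72 * ISI / (1 + exp(0.05039 * (DC - 80)))."""
    return 2.72 * isi / (1.0 + math.exp(0.05039 * (dc - 80.0)))


if __name__ == "__main__":
    # Hypothetical ISI and DC values, for illustration only.
    for isi, dc in [(5.0, 40.0), (10.0, 80.0), (15.0, 120.0)]:
        print(f"ISI={isi:5.1f}  DC={dc:6.1f}  FWI={fwi_simplified(isi, dc):6.2f}")
```

At DC = 80 the exponential term equals 1, so the expression reduces to 1.36 × ISI, which gives a quick sanity check for any implementation.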
Wildfires were concentrated in regions with FWI values between 5 and 35, indicating moderate-to-high fire danger conditions. Even at lower values (below 10), fires were still recorded, suggesting that ignition sources and fuel availability remain influential even when potential fire intensity is not extreme.
The Nesterov Index, used in Russia to assess fire hazard, is calculated as:
$$\begin{aligned} N = \sum (T_{\text {max}} - T_{\text {dew}}) P \end{aligned}$$
(3)
where:
- \(T_{\text {max}}\) is the daily maximum temperature (\(^\circ\)C),
- \(T_{\text {dew}}\) is the dew point temperature (\(^\circ\)C),
- \(P\) is a precipitation factor, set to zero if precipitation exceeds 3 mm, otherwise 1.
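The summation in Eq. (3) can be sketched over daily series as follows. The sketch follows the equation as written in this section; the function name `nesterov_index`, the parameter `threshold_mm`, and the sample values are hypothetical.

```python
def nesterov_index(t_max, t_dew, precip_mm, threshold_mm=3.0):
    """Sum (T_max - T_dew) * P over days, with P = 0 on days wetter than the threshold."""
    if not (len(t_max) == len(t_dew) == len(precip_mm)):
        raise ValueError("all daily series must have the same length")
    total = 0.0
    for tmax, tdew, precip in zip(t_max, t_dew, precip_mm):
        # Precipitation factor P from Eq. (3): 0 if daily precipitation exceeds 3 mm, else 1.
        factor = 0.0 if precip > threshold_mm else 1.0
        total += (tmax - tdew) * factor
    return total


if __name__ == "__main__":
    # Three hypothetical days; the second day is wet and contributes nothing.
    print(nesterov_index([25.0, 20.0, 30.0], [10.0, 15.0, 12.0], [0.0, 5.0, 1.0]))
```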
Wildfires were predominantly recorded in areas where Nesterov Index values ranged from 2,000 to 10,000, reflecting extended dry periods that increased vegetation flammability. This range suggests that while extreme dryness enhances fire likelihood, fires can also occur under moderate dryness if additional factors like wind or human activity contribute to ignition.
Fires occurred more frequently in areas with moderate-to-high DMC and FWI values, confirming that fuel dryness and atmospheric conditions influence wildfire development. Even when FWI was low, fires were recorded, indicating the importance of localized ignition sources and land-use activities in fire occurrences. The Nesterov Index data suggest that while prolonged dryness increases fire risk, moderately dry periods can still lead to wildfires when contributing factors such as human activity, wind, or fuel accumulation are present.
Machine learning methods
A suite of ML algorithms was employed to model wildfire occurrence and classify fire sizes. XGBoost (Extreme Gradient Boosting) is an ensemble method based on gradient boosting decision trees, known for its high predictive accuracy, ability to capture nonlinear relationships, and built-in regularization 52. Random Forest, another ensemble-based method, constructs multiple decision trees using bootstrapped samples and random feature selection, providing strong performance in noisy data and reducing the risk of overfitting, a frequent concern in ecological modeling 53. K-Nearest Neighbors (KNN) is a non-parametric algorithm that classifies observations based on proximity in feature space, offering intuitive modeling of localized patterns in meteorological and spatial data 54, though it can be sensitive to data scaling and high dimensionality. Logistic Regression serves as a baseline linear classifier, useful for assessing the influence of individual predictors under the assumption of linearity 55. Decision Trees, while more prone to overfitting than their ensemble counterparts, provide interpretable hierarchical decision rules that are valuable for exploratory analysis and identifying key thresholds in fire-related variables 56.
To quantitatively assess fire size \(S\) based on these monthly climatic data, one approach is to model \(S\) using a multiple linear regression:
$$\begin{aligned} S = \alpha + \beta _1\,\overline{T}_{\min } + \beta _2\,\overline{T}_{\max } + \beta _3\,\overline{T}_{\text {dew}} + \beta _4\,\overline{H}_{\text {spec}} + \beta _5\,\overline{H}_{\text {rel}} + \beta _6\,\overline{W} + \beta _7\,\overline{P} + \beta _8\,\overline{I}_{\text {sw}} + \beta _9\,\overline{CC} + \beta _{10}\,\overline{SM} + \varepsilon , \end{aligned}$$
(4)
where:
- \(\overline{T}_{\min }\) and \(\overline{T}_{\max }\) denote the average minimum and maximum temperatures (\(^\circ\)C) over the month,
- \(\overline{T}_{\text {dew}}\) is the average dew point temperature (\(^\circ\)C),
- \(\overline{H}_{\text {spec}}\) and \(\overline{H}_{\text {rel}}\) are the average specific humidity (g/kg) and relative humidity (%), respectively,
- \(\overline{W}\) is the average wind speed (m/s),
- \(\overline{P}\) represents precipitation (mm),
- \(\overline{I}_{\text {sw}}\) is the average surface shortwave downward flux (W/m\(^2\)),
- \(\overline{CC}\) is the average cloud cover (%),
- \(\overline{SM}\) is the average soil moisture (%),
- \(\alpha\) is the intercept, \(\beta _i\) are the regression coefficients, and \(\varepsilon\) is the error term.
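A regression of the form in Eq. (4) can be fitted by ordinary least squares. The following is a minimal, dependency-free sketch that solves the normal equations with Gauss-Jordan elimination; the function names and the two-predictor toy data are hypothetical, and a real analysis would more likely use a numerical library and all ten predictors.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of predictor rows; an intercept column is added internally.
    Returns [alpha, beta_1, ..., beta_p].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; the system is singular")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for k in range(p):
            if k != col:
                factor = A[k][col]
                A[k] = [vk - factor * vc for vk, vc in zip(A[k], A[col])]
    return [A[i][p] for i in range(p)]


def predict(coeffs, row):
    """Evaluate alpha + sum(beta_i * x_i) for one predictor row."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


if __name__ == "__main__":
    # Toy data generated from S = 2 + 3*x1 - 1*x2 (no noise).
    X = [[1, 2], [2, 1], [3, 5], [4, 3], [0, 1]]
    y = [3, 7, 6, 11, 1]
    print(fit_linear_regression(X, y))
```

Several predictors in Eq. (4), such as minimum, maximum, and dew point temperature, are likely to be strongly correlated, so coefficient estimates from such a fit should be interpreted with care.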
Alternatively, a fire risk index (FRI) was constructed to encapsulate the combined effect of key climatic factors on fuel dryness and ignition potential:
$$\begin{aligned} FRI = \gamma \cdot \left( \frac{\overline{T}_{\max } - \overline{T}_{\min }}{\overline{P} + \delta } \right) \cdot \overline{W} \cdot \left( 1 - \frac{\overline{SM}}{100}\right) , \end{aligned}$$
(5)
where:
- \(\gamma\) is a scaling constant,
- \(\delta\) is a small constant to avoid division by zero,
- \(\overline{T}_{\max } - \overline{T}_{\min }\) represents the diurnal temperature range,
- \(\frac{1}{\overline{P} + \delta }\) reflects the impact of low precipitation on fuel dryness,
- \(\overline{W}\) is the average wind speed (enhancing oxygen supply and ember dispersal),
- \(1 - \frac{\overline{SM}}{100}\) quantifies the effect of reduced soil moisture.
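Eq. (5) translates directly into a small function. This is a sketch only: the function name `fire_risk_index` is hypothetical, and the default values of `gamma` and `delta` are placeholders, since the section does not state the values used.

```python
def fire_risk_index(t_max, t_min, precip_mm, wind_ms, soil_moisture_pct,
                    gamma=1.0, delta=0.1):
    """Evaluate Eq. (5) from monthly mean conditions.

    gamma (scaling constant) and delta (guard against division by zero)
    are assumed placeholder values, not values taken from the study.
    """
    diurnal_range = t_max - t_min
    dryness = diurnal_range / (precip_mm + delta)
    soil_term = 1.0 - soil_moisture_pct / 100.0
    return gamma * dryness * wind_ms * soil_term


if __name__ == "__main__":
    # Hypothetical monthly means: 25/10 degC, 4.9 mm precipitation,
    # 3 m/s wind, 40 % soil moisture.
    print(round(fire_risk_index(25.0, 10.0, 4.9, 3.0, 40.0), 3))
```

The index grows with the diurnal temperature range and wind speed, and shrinks as precipitation or soil moisture increase, matching the term-by-term interpretation given above.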
For classification tasks (e.g., small vs. large fires), logistic regression can estimate the probability \(P(\text {Large Fire})\):
$$\begin{aligned} P(\text {Large Fire}) = \frac{1}{1 + \exp \Bigl (-\Bigl (\theta _0 + \theta _1\,\overline{T}_{\max } + \theta _2\,\overline{H}_{\text {rel}} + \theta _3\,\overline{W} + \theta _4\,\overline{P} + \theta _5\,\overline{SM}\Bigr )\Bigr )}, \end{aligned}$$
(6)
where \(\theta _0\) is the intercept and \(\theta _i\) are the coefficients associated with the corresponding climatic factors.
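The probability in Eq. (6) can be computed as shown in the sketch below. The feature keys, the function names, and all coefficient values are hypothetical and serve only to illustrate the formula; fitted coefficients are not reported in this section.

```python
import math

# Order of predictors in Eq. (6); the dictionary keys are hypothetical names.
FEATURES = ("t_max", "rel_humidity", "wind", "precip", "soil_moisture")


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def large_fire_probability(x: dict, theta0: float, theta: dict) -> float:
    """Evaluate Eq. (6) for one fire described by its monthly mean conditions."""
    z = theta0 + sum(theta[name] * x[name] for name in FEATURES)
    return sigmoid(z)


if __name__ == "__main__":
    # Made-up coefficients, for illustration only.
    theta = {"t_max": 0.10, "rel_humidity": -0.02, "wind": 0.15,
             "precip": -0.05, "soil_moisture": -0.01}
    conditions = {"t_max": 24.0, "rel_humidity": 65.0, "wind": 3.5,
                  "precip": 4.0, "soil_moisture": 45.0}
    print(round(large_fire_probability(conditions, -1.0, theta), 3))
```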
For machine learning approaches, XGBoost was employed to predict fire size \(S\), minimizing the following objective function:
$$\begin{aligned} \mathscr {L} = \sum _{i=1}^N L(S_i, \hat{S}_i) + \sum _{k=1}^K \Omega (f_k), \end{aligned}$$
(7)
where:
- \(S_i\) denotes the observed fire size for the \(i\)-th sample,
- \(\hat{S}_i\) is the predicted fire size,
- \(L\) is a differentiable loss function (e.g., mean squared error),
- \(\Omega (f_k)\) is a regularization term penalizing complexity,
- \(N\) is the number of samples, and \(K\) is the number of trees in the ensemble.
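To make the two parts of Eq. (7) concrete, the sketch below evaluates the objective for given predictions and a set of trees. It assumes a squared-error loss and the standard XGBoost complexity penalty, gamma times the number of leaves plus half of lambda times the sum of squared leaf weights; it does not train a model, and the function name, the leaf-weight representation, and the sample numbers are hypothetical.

```python
def regularized_objective(observed, predicted, trees, gamma=0.0, lam=1.0):
    """Evaluate Eq. (7) with squared-error loss.

    `trees` is a list of trees, each represented only by its list of leaf
    weights, which is all the complexity penalty needs:
        Omega(f) = gamma * (number of leaves) + 0.5 * lam * sum(w_j ** 2)
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    loss = sum((s - s_hat) ** 2 for s, s_hat in zip(observed, predicted))
    penalty = sum(
        gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)
        for leaf_weights in trees
    )
    return loss + penalty


if __name__ == "__main__":
    # Two samples and one tree with two leaves (made-up numbers).
    print(regularized_objective([1.0, 2.0], [1.0, 4.0], [[1.0, -1.0]], gamma=1.0, lam=2.0))
```

Larger values of `gamma` and `lam` penalize trees with many leaves or large leaf weights, which is how the regularization term discourages overfitting.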
To interpret model predictions, SHAP (SHapley Additive exPlanations) analysis was conducted, decomposing predictions as:
$$\begin{aligned} f(x) = f_{\text {base}} + \sum _{i=1}^{n} \phi _i, \end{aligned}$$
(8)
where:
- \(f_{\text {base}}\) is the mean model prediction over training data,
- \(\phi _i\) represents the contribution of the \(i\)-th feature,
- \(n\) is the number of features.
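The additive decomposition in Eq. (8) can be verified on a small example by computing exact Shapley values through enumeration of all feature coalitions. This sketch is not the SHAP library and is only feasible for a handful of features; the toy model, the background data, and the function name `exact_shapley` are hypothetical, and features outside a coalition are replaced by background values (an assumption about how "absent" features are handled).

```python
from itertools import combinations
from math import factorial


def exact_shapley(model, x, background):
    """Return (f_base, [phi_1, ..., phi_n]) for one instance x.

    Features outside a coalition take their values from each background
    row in turn, and the model output is averaged over those rows.
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for b in background:
            z = [x[i] if i in coalition else b[i] for i in range(n)]
            total += model(z)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return value(set()), phi


if __name__ == "__main__":
    def toy_model(z):
        # Hypothetical model with one linear term and one interaction.
        return 2.0 * z[0] + z[1] * z[2]

    base, phi = exact_shapley(toy_model, [1.0, 2.0, 3.0], [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
    print(base, phi, base + sum(phi))
```

The base value plus the sum of the contributions reproduces the model output for the instance, which is the property expressed by Eq. (8).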
This framework, integrating statistical models and machine learning, enhances our understanding of wildfire dynamics, supporting targeted mitigation strategies.
Proposed approach
This study develops a wildfire classification approach incorporating climatic conditions, forest structure, proximity to urban areas, and detection methods. The dataset combines multiple sources: atmospheric and surface parameters from the National Aeronautics and Space Administration (NASA) 57, wildfire occurrence records from the Krasnoyarsk Krai State Autonomous Institution “Lesopozharny Center” (KGAU) 58, and vegetation data from the regional Ministry of Natural Resources 59. Table 1 summarizes the key characteristics of each dataset, including spatial and temporal resolution, coverage period, and source access details.
The classification pipeline follows a structured workflow, progressing from data preprocessing to model interpretation, as shown in Fig. 8. To capture both local and landscape-level drivers of each fire event, meteorological and biophysical predictors were aggregated around the fire location at six nested spatial scales (A–F). Meteorological variables were sourced from the NASA POWER reanalysis at approximately \(0.5^\circ\) (\(\sim\)55 km) grid resolution and temporally averaged over the 30 days preceding each fire’s recorded date. For each fire centroid, circular buffers of radius 1 km (Scale A), 5 km (Scale B), 10 km (Scale C), 20 km (Scale D), 50 km (Scale E), and 100 km (Scale F) were generated. Within each buffer, NASA-derived continuous variables (e.g., temperature, precipitation, soil moisture) were summarized using mean and standard deviation, recognizing that the 55 km grid resolution may limit representation of sub-buffer heterogeneity, while categorical predictors (e.g., land-cover type, detection method) were expressed as percent cover. Static features such as distance to roads and dominant vegetation class were extracted from the latest map layers. This multiscale, spatiotemporal aggregation ensures that each event is characterized by a comprehensive feature vector reflecting both immediate site conditions and broader environmental context.
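The buffer-based aggregation described above can be sketched with great-circle distances. This is an illustrative sketch, not the authors' processing code: the function names, the representation of grid cells as `(lat, lon, value)` tuples, the use of the population standard deviation, and the fallback to the nearest grid cell when a small buffer contains no cell centre are all assumptions.

```python
import math
from statistics import mean, pstdev

# Buffer radii in km for the six nested scales named in the text.
BUFFER_RADII_KM = {"A": 1, "B": 5, "C": 10, "D": 20, "E": 50, "F": 100}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def buffer_summary(fire_lat, fire_lon, cells, radius_km):
    """Mean and standard deviation of cell values whose centres lie in the buffer.

    Assumption: if no cell centre falls inside the buffer (likely for small
    radii on a ~55 km grid), the nearest cell is used and the spread is 0.
    """
    distances = [(haversine_km(fire_lat, fire_lon, lat, lon), value)
                 for lat, lon, value in cells]
    inside = [value for dist, value in distances if dist <= radius_km]
    if not inside:
        nearest_value = min(distances, key=lambda pair: pair[0])[1]
        return nearest_value, 0.0
    return mean(inside), pstdev(inside)


if __name__ == "__main__":
    # Hypothetical grid cells: (latitude, longitude, monthly mean temperature).
    cells = [(56.0, 93.0, 10.0), (56.0, 93.5, 20.0), (60.0, 100.0, 99.0)]
    for scale, radius in BUFFER_RADII_KM.items():
        print(scale, buffer_summary(56.0, 93.0, cells, radius))
```

With a grid spacing of roughly 55 km, the smaller buffers will often resolve to a single cell, which is the limitation on sub-buffer heterogeneity noted above.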

Workflow of the proposed approach.
During preprocessing, missing values are addressed, outliers are filtered, and relevant features are normalized. Meteorological variables such as temperature, humidity, wind speed, and precipitation are sourced from NASA datasets. Fire characteristics, including area burned, detection method, and suppression details, originate from KGAU reports. The Ministry’s data provides forest composition metrics, helping define dominant vegetation types that impact fire behavior. The feature engineering step includes selecting relevant factors such as meteorological conditions, vegetation type, detection method, and proximity to urban areas. The dataset is split into training and testing subsets (typically 70/30) to evaluate model performance.
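The split and normalization steps can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, min-max scaling is only one possible normalization (the text does not name the method used), and fitting the scaler on the training subset only is a common practice assumed here rather than a detail confirmed by the study.

```python
import random


def split_train_test(rows, test_fraction=0.3, seed=42):
    """Shuffle row indices with a fixed seed and hold out a test fraction."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx = set(indices[:n_test])
    train = [r for i, r in enumerate(rows) if i not in test_idx]
    test = [r for i, r in enumerate(rows) if i in test_idx]
    return train, test


def fit_minmax(train_rows):
    """Column-wise minima and maxima, computed from the training rows only."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_minmax(rows, mins, maxs):
    """Scale each column to [0, 1] using training statistics; constant columns map to 0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


if __name__ == "__main__":
    # Ten hypothetical feature rows with two columns each.
    data = [[float(i), float(2 * i)] for i in range(10)]
    train, test = split_train_test(data)
    mins, maxs = fit_minmax(train)
    print(len(train), len(test))
    print(apply_minmax(test, mins, maxs))
```

Test rows scaled with training statistics can fall slightly outside the [0, 1] range, which is expected and avoids leaking information from the test subset into preprocessing.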
Several machine learning models are applied to classify wildfires by size and risk category, each chosen based on its strengths in handling different types of data and classification tasks. XGBoost is included due to its high performance on structured data, its ability to handle missing values, and its robustness against overfitting through regularization 60. Random Forest is used for its ensemble learning approach, which improves classification accuracy by averaging multiple decision trees and capturing complex feature interactions 61. K-Nearest Neighbors (KNN) serves as a non-parametric baseline model, effective for capturing local patterns in the data but requiring careful tuning to avoid sensitivity to noise 62. Logistic Regression is applied as a simple interpretable model that establishes baseline classification performance, particularly for distinguishing between smaller and larger fires 63. Decision Tree is selected for its transparency and ability to model non-linear relationships, serving as a foundation for more complex ensemble methods 64. The combination of these models allows for a comparative analysis, ensuring that the final classification approach balances accuracy, interpretability, and computational efficiency.
The models are evaluated using accuracy, precision, recall, and \(F_1\) score. XGBoost demonstrates the highest performance and is further analyzed for feature importance using SHAP. This allows for an in-depth assessment of the contribution of individual variables, such as climatic conditions and forest composition, to fire classification.
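The four evaluation metrics can be computed from the confusion counts, as sketched below for a binary case (for example, small versus large fires). The function name `binary_metrics` and the sample labels are hypothetical; for the six size classes, the same quantities would typically be computed per class and then averaged.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classification."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels: 1 = large fire, 0 = small fire.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    preds = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(truth, preds))
```

Because large fires are rare in the dataset, accuracy alone can look favorable for a model that rarely predicts them, which is why precision, recall, and \(F_1\) are reported alongside it.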
The final pipeline can be used in applications such as fire management, early warning systems, and resource allocation. By accurately classifying wildfires, forest management agencies can prioritize areas for prevention or mitigation measures.
