Figure 2 shows the main steps of the proposed methodology for modeling urban air pollution with ML methods and for assessing how spatial and temporal resolution affect modeling accuracy.

The study’s primary workflow diagram/Flowchart created by the authors using Microsoft Paint.
Phase one: assessing the effect of temporal resolution in modeling accuracy
Data source
- Ground Measurement: Daily average PM10, PM2.5, and NOx concentrations from 2017/03/21 to 2018/03/20 were downloaded from Tehran Air Quality Control Company. These pollutants were selected as key indicators of urban air quality because of their well-documented health impacts and their prevalence in vehicular and industrial emissions. The data are numeric values in micrograms per cubic meter (µg/m3), recorded as daily averages. A daily temporal resolution allows the model to detect short-term fluctuations and to capture the daily exposure levels relevant to public health assessments. Missing values and outliers were identified, removed, and re-estimated in the preprocessing stage.
- Meteorological Parameters: Meteorological data from the Naderi monitoring station were obtained as a time series for the same period from the I.R. of Iran Meteorological Organization. The parameters were temperature (°C), humidity (%), wind speed (m/s), wind direction (degrees), pressure (hPa), and rainfall (mm). All variables were collected as daily numeric values and aligned in time with the pollution data to ensure accurate input representation for modeling. These parameters were selected because they influence pollutant dispersion and accumulation processes, which are critical for accurate prediction. The wind direction parameter is transformed from a non-linear to a linear representation according to Eqs. 8 and 956.
$$\mathrm{SinWD}=\frac{\sin\left(2\pi\left(v-\min\left(v\right)\right)\right)}{\max\left(v\right)-\min\left(v\right)}$$
(8)
$$\mathrm{CosWD}=\frac{\cos\left(2\pi\left(v-\min\left(v\right)\right)\right)}{\max\left(v\right)-\min\left(v\right)}$$
(9)
In these two equations, SinWD and CosWD are the linearized components of the wind direction parameter (WD), and v is the wind direction time series.
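As an illustration of the wind direction transformation, the following minimal sketch reads Eqs. 8 and 9 as scaling v to one full cycle before taking the sine and cosine. This reading, with the min-max ratio inside the trigonometric argument, is an assumption about the intended form, and the function name `encode_wind_direction` is hypothetical.

```python
import math

def encode_wind_direction(v):
    """Return (sin, cos) components of a wind direction series.

    Assumption: the series is min-max scaled to one full cycle
    before the sine and cosine are taken.
    """
    lo, hi = min(v), max(v)
    span = hi - lo
    if span == 0:
        raise ValueError("wind direction series is constant")
    angles = [2 * math.pi * (x - lo) / span for x in v]
    sin_wd = [math.sin(a) for a in angles]
    cos_wd = [math.cos(a) for a in angles]
    return sin_wd, cos_wd

# Example with directions in degrees
sin_wd, cos_wd = encode_wind_direction([0, 90, 180, 360])
```

Using two components keeps directions that are close on the compass (for example 359° and 1°) close in feature space, which a single raw angle does not.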
Identifying empty records and outlier values and replacing them using the KNN algorithm
Missing and outlier data points were identified using pollutant-specific AQI thresholds, with values above 500 considered outliers57. Missing and outlier values were imputed using the K-Nearest Neighbors (KNN) method. The neighborhood size (k) was optimized by testing values between 1 and 19, with the Root Mean Square Error (RMSE) of a Random Forest model trained on the imputed data used to evaluate performance. The RMSE results for different k values are summarized in Table 1. Optimal k values were determined as 11 for PM10, 9 for PM2.5, and 5 for NOx.
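The imputation step can be sketched in plain Python as follows. This is a simplified illustration, not the implementation used in the study: the function name `knn_impute`, the use of Euclidean distance on the accompanying features, and the unweighted neighbor mean are assumptions. The search over k (1 to 19) scored by the RMSE of a Random Forest model is not reproduced here.

```python
import math

def knn_impute(features, target, k=5, outlier_threshold=500.0):
    """Replace missing (None) or out-of-range target values with the mean
    of the k nearest valid records, using Euclidean distance on `features`."""
    def is_valid(y):
        return y is not None and y <= outlier_threshold

    valid = [i for i, y in enumerate(target) if is_valid(y)]
    if not valid:
        raise ValueError("no valid records to impute from")
    filled = list(target)
    for i, y in enumerate(target):
        if is_valid(y):
            continue
        # Rank valid records by distance to the record being imputed
        nearest = sorted(
            valid, key=lambda j: math.dist(features[i], features[j])
        )[:k]
        filled[i] = sum(target[j] for j in nearest) / len(nearest)
    return filled

# Toy example: one missing value and one value above the 500 threshold
features = [[0.0], [1.0], [2.0], [3.0], [10.0]]
target = [10.0, None, 14.0, 900.0, 50.0]
filled = knn_impute(features, target, k=2)
```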
Autocorrelation of pollutants concentration
This step examines whether the concentration of each pollutant is influenced by its values on preceding days. An autoregressive (AR) model, which describes a time series whose current value depends on several preceding time points, is used for this purpose. AR models of different degrees are fitted, and the degree with the lowest Akaike information criterion (AIC) and Bayesian information criterion (BIC) values is chosen as optimal. Table 2 presents the AIC and BIC values for each pollutant for AR degrees 1 to 13. To better understand the temporal dependencies of each pollutant, autocorrelation function (ACF) plots were generated (see Fig. 3). These plots show how strongly current pollutant concentrations are correlated with their past values over a range of lags (in days).
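The degree-selection procedure can be illustrated with a short, dependency-free sketch. It assumes Yule-Walker estimation via the Levinson-Durbin recursion and a Gaussian approximation of the likelihood (AIC and BIC up to an additive constant), so the numbers it produces are not comparable to those in Table 2. The function name `ar_information_criteria` and the simulated series are hypothetical.

```python
import math
import random

def ar_information_criteria(series, max_order):
    """Return {degree: (aic, bic)} for AR(1)..AR(max_order).

    Innovation variances come from the Levinson-Durbin recursion on the
    biased sample autocovariances (Yule-Walker estimates).
    """
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    r = [sum(x[t] * x[t - k] for t in range(k, n)) / n
         for k in range(max_order + 1)]
    coeffs, noise_var, out = [], r[0], {}
    for p in range(1, max_order + 1):
        acc = r[p] - sum(coeffs[j] * r[p - 1 - j] for j in range(p - 1))
        k = acc / noise_var  # partial autocorrelation at lag p
        coeffs = [coeffs[j] - k * coeffs[p - 2 - j] for j in range(p - 1)] + [k]
        noise_var *= (1 - k * k)
        n_params = p + 1  # AR coefficients plus the noise variance
        aic = n * math.log(noise_var) + 2 * n_params
        bic = n * math.log(noise_var) + n_params * math.log(n)
        out[p] = (aic, bic)
    return out

# Simulated AR(2) series used only to exercise the function
rng = random.Random(0)
sim = [0.0, 0.0]
for _ in range(500):
    sim.append(0.6 * sim[-1] - 0.5 * sim[-2] + rng.gauss(0, 1))
scores = ar_information_criteria(sim, max_order=6)
best_by_bic = min(scores, key=lambda p: scores[p][1])
```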
For PM10, the research determined that the best degree of the AR model is AR(3). This result is based on the lowest AIC value of 4191.28 and the lowest BIC value of 4207.85 for the third-degree AR model. This shows that using pollutant concentrations from the previous three days yields the most accurate temporal modeling for PM10. The large decline in AIC and BIC values at AR(3) emphasizes the significance of using short-term historical data to adequately capture temporal relationships. The ACF plot for PM10 illustrates how the concentration levels of this pollutant are influenced by previous days’ values. The plot shows a significant autocorrelation at lag 1, 2, and 3, after which the autocorrelation rapidly decreases and falls within the confidence interval, indicating that values beyond the third lag have minimal influence on current concentrations. This pattern supports the selection of AR(3) as the optimal model, as it captures the short-term temporal dependencies effectively. The ACF plot thus confirms that incorporating data from the previous three days provides the most meaningful predictive power for PM10 levels.
For PM2.5, the best degree of the AR model is AR(6), as evidenced by the lowest AIC value of 3716.82 and the lowest BIC value of 3728.52. This study implies that PM2.5 concentrations are best predicted using data from the previous six days. When compared to PM10, PM2.5 has a longer temporal dependency, which may indicate the pollutant’s behavior and interaction with numerous environmental and meteorological elements throughout time. Incorporating a six-day historical background improves the model’s understanding and prediction of PM2.5 level fluctuations. The ACF plot of PM2.5 reveals significant autocorrelations up to lag 6, after which the values drop within the confidence bounds. This confirms that the temporal influence of prior values extends over a longer duration for PM2.5 compared to PM10. The sustained autocorrelation pattern up to six lags supports the selection of AR(6) as the most suitable model. Thus, the ACF visualization validates that incorporating data from the past six days improves the model’s ability to capture the pollutant’s delayed response to environmental and anthropogenic factors.
In the case of NOx, the study revealed AR(4) as the best degree, with the lowest AIC value of 2559.41 and the lowest BIC value of 2570.87. This suggests that using the previous four days of data to anticipate NOx amounts is the most accurate method. The use of AR(4) emphasizes the importance of accounting for short- to medium-term historical data in order to accurately reflect the temporal dynamics of NOx pollution. The model’s capacity to employ a four-day historical window aids comprehension of temporal patterns and swings in NOx levels, which are frequently impacted by factors such as traffic patterns and industrial activity. This information from the table is used in the modeling stages, where the data is prepared based on the correlation they have with their previous days. This finding is further supported by the ACF plot of NOx, which shows significant autocorrelation up to lag 4. Beyond this point, the autocorrelation values fall within the confidence bounds, confirming the appropriateness of using a four-day historical window in modeling. The plot visually reinforces the statistical selection of AR(4) as the optimal temporal dependency structure for NOx concentrations.

ACF plots for PM10, PM2.5, and NOx pollutants//plots created by the authors using Python 3.9.
Temporal modeling and prediction
In this study, we employed daily averages of meteorological parameters and historical pollution data to predict PM10, PM2.5, and NOx at the temporal level, taking into account two unique scenarios. For this phase, only MLP and RF models were applied, as they are well-suited for handling lagged features derived from autoregressive modeling. In contrast, models such as SVM and DT were not used in the temporal modeling phase due to their lower adaptability to autoregressive time-series structures. Specifically, SVM often requires extensive hyperparameter tuning to effectively handle lagged inputs, while DT tends to overfit when faced with temporally dependent features. Thus, selecting MLP and RF helped ensure both computational efficiency and alignment with the core objective of assessing temporal autocorrelation effects71,72,73. In the first scenario, we assumed no autocorrelation, which meant that past pollutant values have no impact on current predictions, with a daily temporal resolution. In the second scenario, we accounted for autocorrelation by predicting present concentrations using prior values of pollutants and meteorological conditions, while also acknowledging the influence of historical data. This resulted in temporal resolutions of 3 days for PM10, 6 days for PM2.5, and 4 days for NOx, respectively, to capture the best temporal dependencies for each pollutant.
Scenario I: Temporal modeling and prediction of pollutants concentration without considering autocorrelation of air pollutants.
At this stage, pollutants are modeled and predicted without taking autocorrelation into account. Using one year of daily averaged meteorological data, we employed an MLP and the RF algorithm to predict pollutant concentrations. The dataset of daily averaged meteorological and pollutant data (365 days) was split into training and testing sets: 75% of the data (approximately 274 days) were used to train the models, and the remaining 25% (approximately 91 days) were reserved as the test set for evaluating model performance. The split was performed sequentially in time to avoid data leakage from future to past observations. The models predict daily pollutant concentrations based solely on the meteorological parameters available for each specific day. Therefore, as long as meteorological input data are available, the models can provide predictions for any future day without being limited to a predefined forecasting horizon. The proposed MLP for each pollutant consists of eight input neurons and one output neuron. Following data preparation, the models are built and trained, and statistical measures (RMSE, MAE, and R2) are then calculated to evaluate them. For PM10 prediction (without autocorrelation), the MLP model consisted of two hidden layers with 12 neurons each and ReLU activation functions. The input dimension was 7, and the model was trained for 200 epochs using the Adam optimizer with mean squared error (MSE) as the loss function. For PM2.5, the MLP had two hidden layers with 12 and 15 neurons, respectively, and was trained for 300 epochs. For NOx, the MLP had three hidden layers with 25 neurons each and 8 input features and was trained for 150 epochs. The Random Forest models had 668 trees for PM10, 60 trees for PM2.5, and 64 trees for NOx. All RF models used default settings for maximum tree depth and bootstrap sampling.
The number of neurons, layers, and epochs were selected based on trial-and-error and grid search to minimize the validation error. The input dimensions correspond to selected meteorological and pollution variables. Epoch values varied depending on convergence behavior of the models.
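The sequential 75/25 split and the three evaluation measures (RMSE, MAE, and R2) can be sketched as follows. The function names are hypothetical, and rounding the cut point to the nearest day is an assumption chosen to reproduce the approximate 274/91 split stated above.

```python
import math

def sequential_split(rows, train_fraction=0.75):
    """Split time-ordered rows without shuffling, so that every test day
    comes after every training day."""
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def regression_metrics(observed, predicted):
    """Return (rmse, mae, r2) for paired observed and predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1 - ss_res / ss_tot
    return rmse, mae, r2

# One year of day indices: 274 training days followed by 91 test days
train_days, test_days = sequential_split(list(range(365)))
```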
Table 3 displays the modeling results for the PM10, PM2.5, and NOx pollutants using the MLP and RF methods. Figure 4 illustrates the regression relationship between the actual and modeled values for each pollutant and method.

Scatter plots showing observed and predicted Pollutants without considering autocorrelation/plots created by the authors using Python 3.9.
According to Table 3 and Fig. 4, the MLP and RF models differ in how well they predict PM10, PM2.5, and NOx. The relatively low predictive accuracy for PM10 and PM2.5 may be attributed to the high temporal variability of air pollutant concentrations, their complex non-linear relationships with meteorological variables, and possibly the limited length of the dataset, which may not have captured all seasonal patterns adequately. For PM10, the RF model outperformed the MLP model, with an RMSE of 86.134 µg/m3 vs. 92.564 µg/m3 and an MAE of 52.770 µg/m3 vs. 54.165 µg/m3, suggesting that RF was better at capturing complicated patterns in the PM10 data. For PM2.5, the MLP model achieved an MAE of 25.807 µg/m3 and an RMSE of 61.096 µg/m3, whereas the RF model obtained an MAE of 21.342 µg/m3 and an RMSE of 53.284 µg/m3, so RF again performed better. In contrast, the MLP model performed better for NOx, with an MAE of 5.770 µg/m3 and an RMSE of 7.905 µg/m3, compared with an MAE of 5.877 µg/m3 and an RMSE of 8.482 µg/m3 for the RF model, probably because it captured nonlinear interactions more efficiently in this case. The scatter plots in Fig. 4 for PM10 and PM2.5 likewise show that the RF predictions are closer to the observed values, as indicated by the tighter clustering of points along the line of perfect agreement. For NOx, the scatter plot shows that the MLP model has a modest advantage over the RF model. These plots support the quantitative findings in Table 3 and illustrate the relative strengths and shortcomings of each model for the different pollutants.
Scenario II: Temporal modeling and prediction of pollutants concentration considering autocorrelation among pollutants.
In this stage, the input data are prepared based on the outcomes of the AR model, and modeling is performed using the MLP and RF methods. The proposed MLP has eleven input neurons for PM10, fourteen for PM2.5, and twelve for NOx, with one output neuron in each case. Statistical metrics such as RMSE, MAE, and R2 are calculated to evaluate the models’ performance.
In Scenario II, the same one-year dataset of daily averaged meteorological and pollutant data (365 days) was used. A temporal train-test split was applied sequentially, with 75% of the data (approximately 274 days) allocated for training and 25% (approximately 91 days) for testing. The lag values used (3 days for PM10, 6 days for PM2.5, and 4 days for NOx) were determined based on the results of the AR model to optimally capture temporal dependencies. It is important to clarify that the lagged inputs correspond exclusively to the target pollutant concentrations, and not to the meteorological variables. For each prediction, the model uses the pollutant concentrations from previous days (as determined by the selected lag) along with the meteorological parameters of the day being predicted. Therefore, as long as these input features are available, the model is capable of generating predictions for any subsequent day, without being limited to a fixed forecast horizon.
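The construction of the lagged inputs described above can be sketched as follows: each row joins the pollutant concentrations of the previous days with the meteorological parameters of the day being predicted. The function name `build_lagged_rows` and the toy values are hypothetical.

```python
def build_lagged_rows(pollutant, meteo, n_lags):
    """For each day t >= n_lags, join the previous n_lags pollutant values
    with the meteorological values of day t; the target is pollutant[t]."""
    inputs, targets = [], []
    for t in range(n_lags, len(pollutant)):
        lagged = pollutant[t - n_lags:t]  # days t-n_lags .. t-1
        inputs.append(list(lagged) + list(meteo[t]))
        targets.append(pollutant[t])
    return inputs, targets

# Toy example with a 3-day lag (the lag selected for PM10)
pollutant = [1.0, 2.0, 3.0, 4.0, 5.0]
meteo = [[10.0], [20.0], [30.0], [40.0], [50.0]]  # one variable per day
X, y = build_lagged_rows(pollutant, meteo, n_lags=3)
```

The first n_lags days have no complete history and are dropped, so the usable sample is slightly shorter than the original series.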
In this scenario, temporal autocorrelation was incorporated by adding lagged features of both target pollutants and meteorological variables, which increased the number of input features accordingly (from 7 to 9 or 11). For PM10, the MLP model consisted of two hidden layers with 16 neurons each and 9 input features, and it was trained for 2000 epochs. For PM2.5, a network with two hidden layers of 14 and 16 neurons and 11 input features was used. The NOx MLP model had two hidden layers with 18 and 19 neurons, respectively, and 10 inputs. For the Random Forest models, the number of trees was set to 55 for PM10, 59 for PM2.5, and 64 for NOx. Hyperparameters were selected based on grid search and validation performance. The lag selection was guided by the AR model, and the inclusion of such lagged inputs significantly improved the models’ ability to capture temporal patterns in pollutant concentrations.
Table 4 presents the findings for PM10, PM2.5, and NOx, showing the efficacy of each modeling technique when autocorrelation is taken into account. Figure 5 illustrates the regression relationship between the actual and modeled values for each pollutant and method.

Scatter plots showing observed and predicted pollutants with considering autocorrelation/plots created by the authors using Python 3.9.
The results in Table 4 show that accounting for autocorrelation improves predictive accuracy for PM10 and PM2.5 but not for NOx. For PM10, the MLP model produced an RMSE of 68.590 µg/m3 and an MAE of 35.569 µg/m3, a considerable improvement over the scenario without autocorrelation. The RF model also improved for PM10, but the MLP model performed better overall. Similarly, for PM2.5, the MLP model outperformed the RF model, with an RMSE of 37.299 µg/m3 and an MAE of 18.479 µg/m3. These findings show the importance of accounting for temporal dependencies when estimating particulate matter concentrations. In contrast, adding autocorrelation did not improve the NOx models: the MLP model had an RMSE of 12.104 µg/m3 and an MAE of 7.747 µg/m3, and the RF model had a higher RMSE of 13.346 µg/m3 and an MAE of 8.016 µg/m3. These results indicate that NOx concentrations are less affected by previous values than PM10 and PM2.5 concentrations are, implying that other variables may play a larger role in determining NOx levels.
The scatter plots in Fig. 5 for PM10 and PM2.5 show that the MLP model’s predictions match the observed values closely, as seen in the tight clustering of points around the line of perfect agreement. This visual evidence supports the quantitative improvements in the RMSE and MAE values. For NOx, the points are more widely dispersed around the line of perfect agreement, indicating that autocorrelation has a limited influence on NOx predictions. The increase in error for NOx in Scenario II may be due to its higher sensitivity to abrupt changes in traffic volume, short-term industrial activity, sudden meteorological changes, or emission sources that are not captured by lagged meteorological or pollutant variables. Unlike particulate matter, NOx concentrations vary more abruptly, which limits the usefulness of historical data for prediction and suggests that additional explanatory factors may be needed to improve the NOx models.
Phase two: assessing the effect of different spatial resolution in accuracy of models
Data source
- Emission Inventory: The annual emission inventory of the city of Ahvaz for the period from 2017/03/21 to 2018/03/20 was obtained; it covers industries, oil facilities, power plants, and mobile sources (The Iranian Department of Environment). PM and NOx were considered in order to evaluate the effect of spatial resolution on their prediction, and CO is also required for the initial modeling. These data represent quantitative annual emissions (kg/year) distributed as point or polygon features in GIS format. Three spatial aggregation levels (500 m, 750 m, 1000 m) were generated to assess the impact of resolution. The emission inventory was selected for its detailed sector-specific data, which serve as a critical input for pollution source modeling and spatial analysis. Figure 6 illustrates the distribution of the pollution sources in Ahvaz.
- MODIS AOD (Aerosol Optical Depth): The MODIS C6 daily 1-km MCD19A2 product, generated by the MAIAC algorithm from both Terra (AM) and Aqua (PM) observations, was used in this study. These satellite data are raster-based, unitless optical thickness values representing atmospheric aerosol content. The 1-km spatial resolution was retained to align with the high-resolution modeling requirements. To generate daily AOD values, all available granules within each day over the study region were filtered and their pixel-wise median was computed on the Google Earth Engine (GEE) platform; the mean AOD for the study period was also computed in GEE. This compositing approach reduces the impact of cloud contamination and missing data, providing a robust daily aerosol optical depth estimate. AOD was selected because of its strong correlation with ground-level particulate matter concentrations, which makes it an indirect but spatially comprehensive pollution indicator.
- Meteorological Parameters: Annual average meteorological data for the same period were obtained from the I.R. of Iran Meteorological Organization, including temperature, humidity, wind speed, wind direction, pressure, and rainfall. These parameters were interpolated using the inverse distance weighting (IDW) method to produce continuous raster maps. The maps were generated at three spatial resolutions (500 m, 750 m, 1000 m) to match the model scenarios. All values are numeric and represent long-term average conditions, which are essential for evaluating the spatial dispersion behavior of pollutants. Including meteorological rasters at multiple resolutions allows the model to capture the influence of spatial scale on physical dispersion dynamics.
- Auxiliary Data: A digital elevation model (DEM) was used to determine the elevation of each point in order to improve the accuracy of the models. The DEM had a spatial resolution of 30 m and was resampled as needed to match the modeling resolutions. Elevation influences air movement and pollutant accumulation patterns, especially in areas with complex terrain, and thus serves as an important spatial input feature.
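Two of the preprocessing steps in the list above, the pixel-wise median compositing of AOD granules and the IDW interpolation of station values, can be illustrated in plain Python. This sketch does not use the Google Earth Engine or ArcGIS APIs. The function names, the use of None for masked pixels, and the IDW power of 2 are assumptions, since the study does not state the power parameter.

```python
import math
from statistics import median

def daily_median_composite(granules):
    """Pixel-wise median over same-shaped 2D granules; None marks a masked
    (cloudy or missing) pixel and is ignored."""
    rows, cols = len(granules[0]), len(granules[0][0])
    composite = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            values = [g[i][j] for g in granules if g[i][j] is not None]
            if values:
                composite[i][j] = median(values)
    return composite

def idw(stations, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y) from
    (station_x, station_y, value) tuples."""
    num = den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # the point coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Three 1x2 granules for one day; the second pixel is masked in two of them
aod = daily_median_composite([[[0.2, None]], [[0.4, None]], [[0.9, 0.5]]])
# A point midway between two stations receives their average
temperature = idw([(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)], 1.0, 0.0)
```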

Distribution of the Industries, oil facilities and power plants in the territory of Ahvaz/the authors created the maps using ArcGIS 10.8.1 software (https://support.esri.com).
Spatial modeling and prediction
The PM and NOx pollutants were determined for modeling and prediction in this section. According to Ahvaz city’s emission inventory, the majority of PM emissions (66%) are generated by stationary sources, primarily industries, with power plants accounting for 31%. Similarly, power plants account for 63% of stationary NOx emissions, with industries contributing 33% (The Iranian Department of Environment). As a result, only industrial facilities, power plants, and oil units were considered for modeling among stationary sources in this study, along with mobile sources. Furthermore, because of the impact of sources outside the city on air pollution within the city, these sources were investigated as well.
Dispersion Modeling.
As previously stated, the main objective is to investigate the significance of spatial resolution in the dispersion of PM and NOx pollutants within cities, taking into account factories, oil facilities, power plants, and mobile sources. It is critical to understand that each pollution source has a particular radius of contamination after dispersion, necessitating the calculation of pollution quantities prior to modeling pollutant concentrations from all sources.
For this purpose, an MLP model with five parameters, including temperature, wind speed, wind direction, CO pollutant concentration, and AOD, is utilized for each pollution source. This model seeks to calculate the amount of pollution spread within a given radius. Temperature, wind speed, and wind direction are all variables to consider since they have a significant impact on the spread of contaminants throughout the city. Additionally, incorporating CO pollutant data is intended to improve modeling accuracy due to its significant association with PM and NOx pollutants. To justify the inclusion of CO as an input feature, Pearson correlation analysis was conducted. The results showed a moderate to strong positive correlation between CO and both PM (r = 0.56) and NOx (r = 0.63), indicating that CO levels can serve as a useful proxy for modeling these pollutants.
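For reference, the Pearson correlation coefficient used to justify CO as an input feature can be computed as in the following minimal sketch. The function name `pearson_r` and the toy series are hypothetical and do not reproduce the values reported above.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Perfectly linear toy series give r = 1 (increasing) or r = -1 (decreasing)
r_up = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_down = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```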
Given the similarity in pollutant concentrations and the comparable radius for contaminant dispersion, a unified model is used for industries and oil facilities. For PM modeling, the architecture consisted of three hidden layers with 12 neurons each, all using ReLU activation functions. Due to the relatively small dataset size, LOOCV was applied to ensure robust performance evaluation. The model was trained for 200 epochs using the Adam optimizer and MSE as the loss function.
For NOx prediction in the same domain, a shallower MLP was employed, consisting of two hidden layers with 17 and 15 neurons, respectively. This model was trained for 200 epochs using the full training data, optimized via Adam with MSE as the loss.
The depth of the PM model was selected based on its improved convergence behavior during cross-validation. All models used 6 input features representing relevant meteorological and operational parameters. Standardization was performed prior to training to facilitate convergence and reduce gradient instability.
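The leave-one-out cross-validation (LOOCV) and standardization steps can be sketched as follows. The function names are hypothetical, and fitting the scaling statistics on the training fold only is an assumption, since the text does not state how the scaler was fitted.

```python
from statistics import mean, pstdev

def standardize_columns(train, other):
    """Z-score both sets using the column means and standard deviations
    of `train` only."""
    cols = list(zip(*train))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in rows]

    return scale(train), scale(other)

def leave_one_out(n):
    """Yield (train_indices, test_index) pairs for LOOCV over n samples."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i

# Each sample is held out once; the scaler is refitted on every fold
samples = [[1.0], [3.0], [5.0]]
for train_idx, test_idx in leave_one_out(len(samples)):
    train_rows = [samples[j] for j in train_idx]
    train_scaled, test_scaled = standardize_columns(train_rows, [samples[test_idx]])
```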
Table 5 presents the initial modeling results for PM and NOx pollutants from industries and oil facilities. In Fig. 7 the observed and predicted concentrations of PM and NOx are compared based on the initial modeling.
The results in Table 5 show that the MLP model forecast pollutant concentrations from industries and oil facilities well for both PM and NOx. The MLP model for PM achieved an R2 of 0.86, indicating good precision in capturing the dispersion patterns of PM from these sources, and the MLP model for NOx achieved an R2 of 0.81.

Scatter plots of observed and predicted PM and NOx for industries and oil facilities, using MLP/plots created by the authors using Python 3.9.
The scatter plots (Fig. 7) for PM show a tight clustering of points around the line of perfect agreement, which supports the prediction accuracy indicated by the statistical measures. This shows that the MLP model captures the dispersion dynamics of PM emissions from industrial and oil sites well. The scatter plots for NOx show a high agreement between observed and predicted values, but with slightly greater spread than for PM. Despite this, the points remain close to the line of perfect agreement, demonstrating the model’s overall accuracy in forecasting NOx dispersion. The slight spread in the NOx scatter plot indicates that, while the MLP model is robust, there may be additional factors driving NOx dispersion that require further exploration.
For modeling pollutant dispersion from mobile sources (cars, taxis, buses, and motorcycles), separate MLP architectures were designed for PM and NOx. The PM model consisted of three hidden layers with 10, 50, and 50 neurons, respectively, each using ReLU activation. The model was trained for 200 epochs using the Adam optimizer with MSE loss. For NOx prediction, a deeper architecture was used, consisting of four hidden layers with 17, 20, 40, and 50 neurons, also using ReLU activation. This model was also trained for 200 epochs, with the same optimizer and loss function.
Both models used 6 input features, representing a combination of traffic-related and meteorological variables. The architecture depth and neuron counts were determined through empirical tuning to optimize validation performance. All input data were standardized before training.
The results are reported in Table 6; Fig. 8 illustrates the comparison of observed and predicted concentrations of PM and NOx.
The findings show that the MLP model predicted pollutant concentrations from mobile sources well for both PM and NOx, with an R2 of 0.88 for PM dispersion and an R2 of 0.89 for NOx dispersion.

Scatter plots of observed and predicted PM and NOx for mobility sources, using MLP/plots created by the authors using Python 3.9.
The scatter plots in Fig. 8 for PM show a tight clustering of points around the line of perfect agreement, which again supports the prediction accuracy indicated by the statistical measures. The scatter plots for NOx show a high agreement between observed and predicted values, but with slightly greater spread than for PM. Despite this, the points remain close to the line of perfect agreement, demonstrating the model’s overall accuracy in forecasting NOx dispersion.
Due to the limited number of power plants (only two), the model cannot be implemented. In this scenario, the pollution produced by the power plants is deemed to be equivalent to the amount of pollution distributed within the impact radius.
Computing the Pollution Dispersion.
Following the construction of the model, the extent of pollution dispersion within the impact radius for each pollution source must be determined. The distance over which pollutants emitted from a source are dispersed is defined as the impact radius. The impact radius is determined using ArcGIS. Following the determination of the impact radius’s extent, regular grids with resolutions of 500 m, 750 m, and 1000 m are formed using GIS within the specified impact radius.
The impact radius for industries and oil facilities is set to 3 kilometers58,59. Once this radius has been determined, the model parameters (temperature, wind speed, wind direction, and CO concentration) are calculated within the grids at each resolution. Figure 9 depicts the 1000 m grid created for industries and oil facilities. These parameters are then fed into the model to calculate the concentrations of PM and NOx within the three grids.
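The construction of a regular grid inside an impact radius can be illustrated with the following sketch. The study built its grids in ArcGIS, so this is only a simplified stand-in: it assumes planar coordinates in meters and keeps a cell when its center lies within the radius, and the function name `grid_centres_within_radius` is hypothetical.

```python
import math

def grid_centres_within_radius(cx, cy, radius, cell):
    """Return the centers of square cells (side `cell`) whose center lies
    within `radius` of the source located at (cx, cy)."""
    half = math.ceil(radius / cell)
    centres = []
    for i in range(-half, half):
        for j in range(-half, half):
            x = cx + (i + 0.5) * cell
            y = cy + (j + 0.5) * cell
            if math.hypot(x - cx, y - cy) <= radius:
                centres.append((x, y))
    return centres

# Grids at the three resolutions for a 3 km impact radius around one source
grids = {cell: grid_centres_within_radius(0.0, 0.0, 3000.0, cell)
         for cell in (500.0, 750.0, 1000.0)}
```

Finer cells yield more prediction points inside the same radius, which is one reason the volume of training data differs between resolutions.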

Grid created in the specific radius at a resolution of 1000 m for industries and oil facilities/The authors created the maps using ArcGIS 10.8.1 software (https://support.esri.com).
The impact radius for mobile sources is taken to be 250 meters60. After the model parameters are computed, they are fed into the model to calculate PM and NOx concentrations from mobile sources across the three grids. Figure 10 depicts the 750 m grid created for mobile sources. Note that pollution from mobile sources is treated as a point source in order to facilitate the final integration of pollutant layers from the various sources.

Grid created in the specific radius at a resolution of 750 m for mobility sources/The authors created the maps using ArcGIS 10.8.1 software (https://support.esri.com).
The impact radius of power plants is estimated to be around 5 km61. The pollution produced by power plants is assumed to remain constant and equal within the impact radius. In reality, pollutant dispersion effects are not limited to the impact radius; however, because the model addresses only a portion of the overall problem, this study assumes that pollutant impacts are restricted to the defined impact radius.
Integration of different pollutant maps
In the previous steps, the concentrations of PM and NOx originating from the different sources were determined at the specified resolutions (500 m, 750 m, 1000 m). In this stage, the separate pollution layers for each pollutant are overlaid at the corresponding resolution, so that the contributions of all the sources considered are accounted for in the final model. Figure 11 depicts how the different layers are combined at a resolution of 500 m. The resulting map presents the concentrations of PM and NOx pollutants from all sources in each pixel. Meteorological data (temperature, humidity, wind speed, wind direction, air pressure, and rainfall), altitude, and geographical coordinates are then determined at these points. The same procedure is applied to overlay the PM layers on the other grids, and a similar combination is performed for NOx.
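The overlay step can be sketched as a per-cell combination of the source-specific layers. The text does not state the overlay operator, so summing the contributions in each cell is an assumption made here for illustration; the function name `overlay_layers`, the dictionary layout, and the toy values are also hypothetical.

```python
def overlay_layers(*layers):
    """Combine per-source layers given as {cell_id: concentration} by
    summing the contributions that fall in the same grid cell."""
    total = {}
    for layer in layers:
        for cell, value in layer.items():
            total[cell] = total.get(cell, 0.0) + value
    return total

# Toy layers keyed by (row, column) on a common grid
industry = {(0, 0): 1.0}
mobile = {(0, 0): 2.0, (0, 1): 3.0}
combined = overlay_layers(industry, mobile)
```

Cells covered by only one source keep that source's value, and cells covered by none are absent from the result.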

Integration of PM pollutants from various resources (spatial resolution of 500 m)/The authors created the maps using ArcGIS 10.8.1 software (https://support.esri.com).
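The overlay step can be sketched as a cell-wise combination of equally shaped grids. The text does not state the combination rule, so the cell-wise sum used here is an assumption, as are the function name, the layer names, and the values.

```python
def overlay_layers(layers):
    """Combine equally shaped 2D grids by summing them cell by cell.

    Summation is an illustrative assumption; the source only states that
    the layers are overlaid so that every source contributes to each pixel.
    """
    n_rows = len(layers[0])
    n_cols = len(layers[0][0])
    for layer in layers:
        if len(layer) != n_rows or any(len(row) != n_cols for row in layer):
            raise ValueError("all layers must share the same grid shape")
    return [
        [sum(layer[r][c] for layer in layers) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical 2 x 2 layers on the same 500 m grid.
mobile_pm = [[1.0, 0.0], [0.0, 2.0]]
power_plant_pm = [[0.5, 0.5], [0.0, 0.0]]
total_pm = overlay_layers([mobile_pm, power_plant_pm])
```

Checking the grid shape before combining reflects the requirement that layers be overlaid at a common resolution.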
Final modeling and determining the better spatial resolutions
After the layers were prepared, final grid-based modeling of PM and NOx concentrations was carried out at spatial resolutions of 1000 m, 750 m, and 500 m, with different MLP architectures designed according to data complexity and spatial resolution. For PM at 1000 m resolution, the MLP model consisted of five hidden layers with 50 neurons each, using ReLU activations. At 750 m, a slightly deeper network with six hidden layers (40 neurons each) was employed, while at 500 m the model used four hidden layers with 60 neurons each to prevent overfitting due to the increased spatial granularity. For NOx, the 1000 m model included five hidden layers with 50 neurons each, whereas the 750 m and 500 m models used four and three hidden layers, respectively (55 neurons each), tuned to account for varying emission dispersion patterns at different scales. All models took 10 input features and were trained for 200 epochs using the Adam optimizer with MSE as the loss function.
The model architectures were selected based on iterative trial-and-error combined with preliminary grid search to minimize validation error. Additionally, model complexity was adjusted in relation to spatial resolution and data volume to prevent overfitting and improve generalization.
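To make the relative complexity of these architectures concrete, the following sketch counts the trainable parameters of each network. It assumes fully connected layers with bias terms and a single output neuron; the dictionary and function names are hypothetical and serve only as an illustration.

```python
# Hidden-layer sizes as reported in the text, keyed by (pollutant, grid size in m).
ARCHITECTURES = {
    ("PM", 1000): (50,) * 5,
    ("PM", 750): (40,) * 6,
    ("PM", 500): (60,) * 4,
    ("NOx", 1000): (50,) * 5,
    ("NOx", 750): (55,) * 4,
    ("NOx", 500): (55,) * 3,
}
N_INPUTS = 10  # number of input features stated in the text


def count_parameters(n_inputs, hidden, n_outputs=1):
    """Count weights and biases of a dense MLP (assumed layout, one output)."""
    sizes = [n_inputs, *hidden, n_outputs]
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


parameter_counts = {
    key: count_parameters(N_INPUTS, hidden) for key, hidden in ARCHITECTURES.items()
}
for (pollutant, grid_m), n_params in parameter_counts.items():
    print(f"{pollutant} {grid_m} m: {n_params} parameters")
```

Under these assumptions, parameter count depends on both depth and width, so a shallower network is not necessarily a smaller one.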
The modeling results presented in Table 7 were obtained by splitting the dataset into training and testing subsets. The models were trained on the training subset and evaluated on the testing subset; cross-validation was not applied. Figure 12 compares the observed and predicted concentrations of PM and NOx from the final modeling.
According to Table 7, for PM at 500 m resolution, the MLP model has an RMSE of 19.779 kg/year. At 750 m, the RMSE was 19.324 kg/year, a marginal improvement, and at 1000 m it fell to 13.509 kg/year. These values indicate that the model describes the dispersion patterns of PM in residential environments acceptably at all three resolutions, and that PM is modeled best at the coarsest spatial resolution (largest grid size). Given the problem's data structure, the modeling accuracy for PM improves as the grid cells become larger. In contrast, the RMSE for NOx at 500 m resolution was 307.496 kg/year, the lowest of the three grids. At 750 m, the RMSE rose to 461.505 kg/year, and at 1000 m it reached 568.380 kg/year, indicating that prediction accuracy decreases as the resolution becomes coarser. For NOx, therefore, finer resolutions (smaller grid sizes) improve modeling accuracy.
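A single hold-out split of this kind can be sketched as follows. The split ratio and random seed are not reported in the text, so the 20% test fraction and the seed below are placeholder assumptions, and the function name is hypothetical.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=0):
    """Split rows once into training and testing subsets (no cross-validation).

    test_fraction and seed are illustrative assumptions, not reported values.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = round(len(rows) * test_fraction)
    test_indices = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_indices]
    test = [row for i, row in enumerate(rows) if i in test_indices]
    return train, test


# Hypothetical grid cells identified by an integer id.
cells = list(range(100))
train_cells, test_cells = holdout_split(cells)
```

Because every sample falls in exactly one subset, the reported metrics come from a single evaluation on unseen cells and may vary with the particular split.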

Scatter plots of observed and predicted PM and NOx in the grids with different spatial resolutions/plots created by the authors using Python 3.9.
In Fig. 12, the scatter plot for PM at 1000 m resolution shows close agreement between observed and predicted values, with points clustered tightly around the line of perfect agreement, consistent with the low RMSE at this resolution. The scatter plots at 500 m and 750 m also show satisfactory agreement, but with slightly more scatter than at 1000 m. For NOx, the scatter plot at 500 m resolution shows a tight clustering of points around the line of perfect agreement, consistent with the statistical measures. As the grid size increases to 750 m and 1000 m, the scatter plots for NOx show a wider dispersion around the line of perfect agreement, demonstrating that model performance degrades at coarser resolutions.
After the models were developed and the findings validated, the spatial distribution maps of PM concentration on the 1000 m grid and NOx concentration on the 500 m grid were produced (Fig. 13). Although the modeling was performed at 1000 m for PM and 500 m for NOx, the maps are presented at a 100 m resolution to give a clearer and more detailed depiction. This finer display resolution improves the visual interpretation of spatial patterns and pollutant dispersion, making it easier to identify areas with higher concentrations and to understand the distribution.

PM and NOx Concentration distribution map in the study area/The authors created the maps using ArcGIS 10.8.1 software (https://support.esri.com).
Figure 13 shows that PM concentrations are highest in the northern part of the study area and decrease toward the south. The areas with the highest concentrations are shown in red, indicating major emission sources in the north, most likely due to industrial activity. The figure also displays the spatial distribution of NOx concentrations. As with PM, the highest NOx concentrations are found in the northern part of the study area, as indicated by the red zones. This pattern suggests that most NOx emissions come from industrial or high-traffic urban areas in the north. Concentrations drop in the southern parts of the study area, which are shown in yellow to indicate lower emission levels.
Accuracy assessment of different ML methods
At this point, we compare different methods for PM on the 1000 m grid and NOx on the 500 m grid in the final modeling. The methods compared are MLP, RF, DT, and SVR. All models were tuned by trial and error, taking pollutant characteristics into account. MLP models used 4–6 hidden layers with 25–50 neurons each and ReLU activation; the number of epochs ranged from 150 to 250 depending on convergence. RF models included 60–120 trees with default depth settings. DT models had a maximum depth of 10–15 to prevent overfitting. SVR models used an RBF kernel with optimized C and gamma values. Model settings were adapted to each pollutant and spatial resolution to ensure a fair comparison. Table 8 shows the results of pollutant modeling using these methods.
According to Table 8, the MLP model outperformed the other models for PM at 1000 m resolution, with an R2 of 0.90, an MAE of 5.037 kg/year, and an RMSE of 13.509 kg/year. The DT model had a lower R2 of 0.81 but a lower MAE of 3.881 kg/year than MLP, indicating a smaller average error. However, its larger RMSE of 18.824 kg/year shows that its predictions contained more large errors than those of MLP. The RF model performed worse than both MLP and DT, with an R2 of 0.56, an MAE of 8.653 kg/year, and an RMSE of 27.040 kg/year. The SVR model performed poorly, with an R2 of 0.02, an MAE of 13.994 kg/year, and an RMSE of 43.578 kg/year. For NOx at 500 m resolution, the MLP model obtained the highest R2 of 0.90, with an MAE of 50.528 kg/year and an RMSE of 307.496 kg/year. Despite the strong R2 value, the large RMSE indicates some large prediction errors, most likely due to the high NOx values in the dataset. The DT model had an R2 of 0.81 and a lower MAE (44.707 kg/year) than MLP, but a higher RMSE of 417.542 kg/year. This suggests that, while the DT model had a smaller average error, it struggled more with extreme NOx values. The RF model performed well, with an R2 of 0.86, an MAE of 62.405 kg/year, and an RMSE of 367.494 kg/year. The SVR model was unsuccessful, with a negative R2 of −0.13, an MAE of 348.696 kg/year, and an RMSE of 1014.953 kg/year; a negative R2 means that the model fits the test data worse than a constant prediction equal to the observed mean.
Overall, the MLP model has the highest prediction accuracy for both NOx and PM. It is worth noting, however, that the DT model had lower MAE values for both pollutants, implying that its predictions were, on average, closer to the observed values. The higher RMSE values of the DT models suggest that, although their average errors were smaller, they made larger errors on individual predictions than MLP. For NOx, the MLP model captures the general trend well but has difficulty with the high variability of NOx concentrations, as indicated by the combination of a high R2 and a high RMSE. The larger prediction errors probably stem from the intrinsically high NOx values in the dataset. This underlines the importance of considering both RMSE and MAE when assessing the effectiveness of a model, since they offer distinct views of prediction accuracy.
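The different behaviour of MAE, RMSE, and R2 can be reproduced with a small worked example. The sketch below uses the standard definitions of the three metrics. The observed and predicted values are toy numbers chosen for illustration, not values from Table 8, and the variable names are hypothetical.

```python
import math


def mae(observed, predicted):
    """Mean absolute error: every miss counts in proportion to its size."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error: squaring penalizes large misses more heavily."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r2(observed, predicted):
    """Coefficient of determination; negative if worse than predicting the mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy data: four typical cells and one cell with a very high value.
observed = [10.0, 12.0, 11.0, 13.0, 200.0]
steady = [14.0, 8.0, 15.0, 9.0, 196.0]    # misses every cell by 4
spiky = [10.0, 12.0, 11.0, 13.0, 185.0]   # exact except for one miss of 15
constant = [100.0] * 5                     # ignores the inputs entirely

for name, predicted in [("steady", steady), ("spiky", spiky), ("constant", constant)]:
    print(name, mae(observed, predicted), rmse(observed, predicted), r2(observed, predicted))
```

In this toy case the "spiky" predictions have the lower MAE but the higher RMSE, the same pattern reported for DT relative to MLP, and the constant predictions yield a negative R2, as reported for SVR on NOx.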
Dispersion model performance
In some cases, CO concentrations are unavailable, so modeling must rely solely on meteorological and AOD data. For preliminary dispersion modeling without CO data, MLP models used 3–4 hidden layers with 20–40 neurons each and ReLU activation. Input features included temperature, wind speed, wind direction, and AOD. Models were trained for 150–200 epochs with the Adam optimizer, minimizing MSE. These configurations were selected through iterative tuning to balance accuracy and training time.
The proposed MLP models for PM and NOx pollutants consist of four input neurons and one output neuron. The results of dispersion modeling of PM and NOx pollutants from stationary and mobile sources are shown in Table 9.
Given the acceptable R2 and RMSE values obtained without CO, it can be inferred that modeling without CO is valid in certain situations. As a result, even when the CO concentration is unknown, modeling can still be carried out using meteorological and AOD data.
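The four-input, one-output network described above can be sketched as a plain forward pass. The choice of three hidden layers with 30 neurons lies within the reported ranges but is otherwise arbitrary. The random weights, the scaled input values, and the function names are assumptions for illustration only; training with Adam is not shown.

```python
import random


def init_mlp(n_inputs, hidden, n_outputs=1, seed=0):
    """Create randomly initialized dense layers as (weights, biases) pairs."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden, n_outputs]
    layers = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(fan_in)] for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def forward(layers, features):
    """Propagate one sample: ReLU on hidden layers, linear output layer."""
    activation = list(features)
    for index, (weights, biases) in enumerate(layers):
        z = [
            sum(w * a for w, a in zip(row, activation)) + b
            for row, b in zip(weights, biases)
        ]
        is_output = index == len(layers) - 1
        activation = z if is_output else [max(0.0, v) for v in z]
    return activation


# Inputs in the order given in the text: temperature, wind speed,
# wind direction, AOD (hypothetical values, assumed already scaled).
network = init_mlp(n_inputs=4, hidden=(30, 30, 30))
prediction = forward(network, [0.5, 0.2, -0.3, 0.1])
```

The output is a single value per sample, corresponding to the predicted pollutant quantity for that grid cell.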
