KNOXVILLE, TN, December 11, 2025 /24/7 Press Release/ -- Because people spend most of their time indoors, understanding how ozone behaves indoors is critical to assessing human health risks. A new study presents the first large-scale machine learning model that predicts hourly indoor ozone (O₃) concentrations from easily accessible predictor variables such as outdoor O₃, weather conditions, and window opening and closing behavior.
Ozone (O₃) is a major air pollutant formed by chemical reactions between nitrogen oxides and volatile organic compounds under sunlight. In 2021, approximately 490,000 people died worldwide due to long-term O₃ exposure. Although most exposure assessments rely on outdoor data, people typically spend 70% to 90% of their time indoors, and ventilation, indoor sources, and building materials all influence actual O₃ levels. Traditional mechanistic models require detailed indoor parameters that are difficult to obtain in large-scale studies, while linear regression models struggle with nonlinear environmental relationships. Because of these limitations, there is an urgent need to develop accurate and scalable models that can predict indoor O₃ exposure based on accessible environmental and behavioral data.
Researchers from Fudan University and the Chinese Academy of Sciences built a machine learning model to predict hourly indoor O₃ levels in 18 Chinese cities. The study (DOI: 10.1016/j.eehl.2025.100170), published in Eco-Environment & Health on July 9, 2025, used a random forest algorithm trained on low-cost sensor measurements combined with meteorological and ventilation data. By comparing two models, one with and one without window status information, the researchers demonstrated that including ventilation behavior significantly improved prediction accuracy, a major step toward more realistic O₃ exposure assessments.
The team collected more than 8,200 hours of indoor O₃ data using portable electrochemical sensors in 23 households. Predictor variables included outdoor O₃ levels (from high-resolution random forest and MERRA-2 datasets), meteorological parameters (temperature, humidity, wind, solar radiation, boundary layer height, surface pressure), and window opening and closing status recorded manually by volunteers. The researchers compared two random forest models, one excluding and one including window status. Incorporating window status increased the cross-validation R² from 0.80 to 0.83 and decreased the RMSE from 7.89 to 7.21 ppb. The model accurately captured hourly O₃ variations and regional differences, and it performed better in southern China than in northern China and in the cold season than in the warm season. Variable importance analysis showed surface pressure, temperature, and ambient O₃ to be the dominant predictors, with ventilation emerging as an important behavioral determinant. Daytime comparisons revealed that indoor O₃ concentrations were 40% lower than outdoor levels, highlighting the buffering effect of the indoor environment.
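For readers unfamiliar with the two accuracy metrics quoted above, the following minimal Python sketch shows how RMSE and R² are commonly computed from observed and predicted values. It is an illustration only, not the study's code: the function names and all numbers are hypothetical, and the R² definition used here (1 minus the ratio of residual to total sum of squares) is an assumption, since the release does not say which definition the authors applied.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (ppb here)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up hourly indoor O3 values in ppb, for illustration only.
observed = [12.0, 18.0, 25.0, 31.0, 22.0, 15.0]
pred_without_window = [16.0, 17.0, 20.0, 26.0, 25.0, 19.0]
pred_with_window = [13.0, 18.5, 23.0, 29.0, 23.0, 16.0]

for label, pred in [
    ("without window status", pred_without_window),
    ("with window status", pred_with_window),
]:
    print(f"{label}: R2 = {r_squared(observed, pred):.2f}, "
          f"RMSE = {rmse(observed, pred):.2f} ppb")
```

A higher R² and a lower RMSE both indicate predictions closer to the measurements, which is the direction of change the study reports when window status is added.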
“Most exposure studies still rely on outdoor O₃ data, but that's only half the story,” said Professor Xia Meng, senior author of the study. “Our findings show that simple ventilation behaviors, such as opening or closing a window, can dramatically change exposure. Integrating such behavioral data with weather information through machine learning will ultimately allow us to more accurately estimate indoor O₃ at scale. This will enhance epidemiological studies and guide public health interventions in urban and residential settings.”
The study presents a practical, low-cost strategy for real-time prediction of indoor O₃ exposure across a large geographic area. The model can be integrated into health risk assessments, smart home monitoring systems, and public health surveillance platforms, allowing policymakers and scientists to better understand the differences between indoor and outdoor exposures. Future research could extend this framework to other pollutants such as particulate matter and nitrogen dioxide, incorporate smart sensors for automated window tracking, and expand monitoring to more diverse climate zones. Ultimately, this machine learning approach will bridge environmental modeling and daily life, promoting healthier indoor environments in rapidly urbanizing regions.
References
DOI
10.1016/j.eehl.2025.100170
Original source URL
https://doi.org/10.1016/j.eehl.2025.100170
Funding information
This study was funded by the National Natural Science Foundation of China (82003413 and 82030103).
About the journal
Eco-Environment & Health (EEH)
At Chuanlink Innovations, innovative ideas realize their true potential. Our name is rooted in the essence of communicating and connecting, and reflects our commitment to fostering innovation and facilitating the journey of ideas from conception to realization.
Related links:
http://chuanlink-innovations.com
# # #
