AI models improve accuracy of global flood predictions



Scientists are increasingly focusing on improving global river flow forecasts to enhance both flood prediction and sustainable water resource management. Maria Luisa Taccari, Kenza Tazi, and Oisín M. Morrison, in collaboration with colleagues from the European Centre for Medium-Range Weather Forecasts (ECMWF) in Reading, UK, and Bonn, Germany, have presented AIFL (Artificial Intelligence for Floods), a new deterministic LSTM-based model for global daily flow forecasts. The study addresses a critical gap in current data-driven models, which often lose performance when moving from historical data to real-time predictions. AIFL bridges this domain shift from reanalysis to forecasting with a two-step training strategy: initial pre-training on 40 years of ERA5-Land reanalysis, followed by fine-tuning on operational Integrated Forecasting System (IFS) forecasts. The result is a transparent and reproducible new baseline within the CARAVAN ecosystem that achieves high forecast skill and accuracy competitive with existing global systems.

Global flood forecasting is entering a new era of accuracy and reliability. AIFL produces a single, consistent flow forecast for each of approximately 19,000 river basins around the world, offering a transparent and readily available tool for improving water management and disaster preparedness.

Data-driven models often perform poorly when moving from research environments to real-world operational forecasting. AIFL (Artificial Intelligence for Floods), a new deterministic model, addresses this challenge with a two-step training approach. Trained with data from 18,588 river basins in the CARAVAN dataset, AIFL aims to bridge the gap between historical simulations and the uncertainties inherent in live weather forecasts.

Achieving accurate predictions over large geographic areas requires overcoming the limitations of both traditional physically-based hydrological models and newer machine learning techniques. Traditional models struggle to represent the complexity of water flow and depend on high-quality climate data, while many machine learning approaches lose accuracy when they move from historical data to forecasts of the future.

AIFL employs a two-step training strategy. It first learns from 40 years of ERA5-Land reanalysis data to establish a broad representation of hydrological processes, and is then fine-tuned on operational Integrated Forecasting System (IFS) forecasts. Independent tests using data from 2021 to 2024 revealed high predictive skill, with a median modified Kling-Gupta efficiency (KGE’) of 0.66 and a median Nash-Sutcliffe efficiency (NSE) of 0.53.

The model exhibits particular strength in identifying extreme events and provides a reliable baseline for the global hydrological community. Accurately predicting peak flows is essential to an effective flood warning system, giving communities time to prepare and mitigate damage. By combining historical data analysis with adaptation to real-time forecasts, AIFL offers a principled and operationally sound approach to global river flow forecasting. It has the potential to enhance disaster preparedness and water management strategies around the world, especially as climate change exacerbates hydrological extremes.

Pre-training with reanalysis and fine-tuning with operational forecasts improve flow forecast accuracy

A two-step training strategy underpins AIFL, a deterministic LSTM-based model for global daily flow forecasting. The model was first pre-trained on 40 years of ERA5-Land reanalysis data spanning 1980 to 2019 to establish a strong foundation in fundamental hydrological processes, and was then fine-tuned on operational IFS forecasts. By adopting a deterministic LSTM architecture, the work moved away from probabilistic approaches and focused on producing a single best-estimate discharge forecast for each basin.
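The pre-train-then-fine-tune pattern can be illustrated with a deliberately small sketch. This is not the AIFL implementation: a one-feature linear model stands in for the LSTM, the data are synthetic, and every name, learning rate, and the 20% dry bias in the "forecast" forcing are assumptions chosen for illustration. It shows only the general idea of training on one forcing source and then continuing to train the same weights on another.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the linear model y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(w, b, xs, ys, lr, epochs):
    """Full-batch gradient descent on the mean squared error."""
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic toy data (assumption): flow is half of the true precipitation.
precip_true = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
flow_obs = [0.5 * p for p in precip_true]

# "Reanalysis" forcing is taken as unbiased; "forecast" forcing has a dry bias.
precip_reanalysis = precip_true
precip_forecast = [0.8 * p for p in precip_true]

# Stage 1: pre-train on the reanalysis-like forcing.
w, b = train(0.0, 0.0, precip_reanalysis, flow_obs, lr=0.02, epochs=2000)
err_before = mse(w, b, precip_forecast, flow_obs)

# Stage 2: fine-tune the same weights on the forecast-like forcing,
# using a smaller learning rate so the pre-trained state is adjusted gently.
w_ft, b_ft = train(w, b, precip_forecast, flow_obs, lr=0.005, epochs=2000)
err_after = mse(w_ft, b_ft, precip_forecast, flow_obs)

print(f"error on forecast forcing before fine-tuning: {err_before:.4f}")
print(f"error on forecast forcing after fine-tuning:  {err_after:.6f}")
```

In this toy setting the pre-trained model underestimates flow when driven by the biased forecast forcing, and fine-tuning shifts the weights to compensate for that bias.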

This methodology bridges the gap between historical reanalysis data and real-time forecast products. Unlike many existing global hydrological models, which rely on complex physics-based frameworks with parameters carefully tuned to geophysical maps, LSTM networks learn directly from the relationships observed in the training data.

Forcing data, including meteorological variables, are incorporated into the model along with static attributes such as topography and land cover, allowing AIFL to implicitly capture their influence on flow without explicit parameterization. This work represents the first global model trained end-to-end within the CARAVAN ecosystem, providing a streamlined and reproducible forcing pipeline. The research team aimed to create a baseline model that could be used by the broader hydrological community, prioritizing operational robustness and the ability to reliably detect extreme events.
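One common way to feed both kinds of input to an LSTM is to repeat a basin's static attributes at every time step and append them to that day's forcings. The article does not describe AIFL's exact input layout, so this arrangement is an assumption, and the function name, variable names, and values below are hypothetical.

```python
def build_lstm_inputs(forcings, static_attrs):
    """Append the same static attributes to every daily forcing vector.

    forcings:     list of per-day feature lists, e.g. [precipitation, temperature]
    static_attrs: list of basin-level features, e.g. [mean_elevation, forest_fraction]
    Returns one input vector per day of length n_forcing + n_static.
    """
    return [list(day) + list(static_attrs) for day in forcings]


# Hypothetical three-day sequence for a single basin.
forcings = [[2.0, 11.5], [0.0, 13.0], [7.5, 9.0]]
static_attrs = [540.0, 0.35]

inputs = build_lstm_inputs(forcings, static_attrs)
for day_vector in inputs:
    print(day_vector)
```

Because the static values are identical on every day, the network can condition its response to the forcings on basin characteristics without any hand-written parameterization.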

Global streamflow prediction and extreme event detection using AIFL

AIFL achieved a median modified Kling-Gupta efficiency (KGE’) of 0.66 and a median Nash-Sutcliffe efficiency (NSE) of 0.53 over the independent test period spanning 2021-2024, demonstrating substantial skill in global daily flow forecasting. The two scores are complementary: KGE’ measures the overall agreement between simulated and observed flows, while NSE evaluates how much the model reduces error variance relative to simply predicting the observed mean.
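For readers unfamiliar with the two scores, the sketch below computes them from paired observed and simulated flows using their standard definitions: NSE as one minus the ratio of squared error to the variance of the observations, and KGE’ as one minus the Euclidean distance from the ideal point in correlation, bias-ratio, and variability-ratio space. Whether the AIFL evaluation code matches this exactly is an assumption, and the function names and sample values are illustrative.

```python
import math
from statistics import mean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mu_obs = mean(obs)
    squared_error = sum((s - o) ** 2 for o, s in zip(obs, sim))
    obs_variance = sum((o - mu_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance


def kge_prime(obs, sim):
    """Modified Kling-Gupta efficiency: 1 is perfect."""
    mu_obs, mu_sim = mean(obs), mean(sim)
    sd_obs, sd_sim = pstdev(obs), pstdev(sim)
    cov = mean((o - mu_obs) * (s - mu_sim) for o, s in zip(obs, sim))
    r = cov / (sd_obs * sd_sim)                    # linear correlation
    beta = mu_sim / mu_obs                         # bias ratio
    gamma = (sd_sim / mu_sim) / (sd_obs / mu_obs)  # ratio of coefficients of variation
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


# Made-up daily flows for one basin.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.2, 1.9, 3.3, 3.8]

print("NSE :", round(nse(observed, simulated), 3))
print("KGE':", round(kge_prime(observed, simulated), 3))
```

A global summary such as the medians quoted above would then be obtained by computing each score per basin and taking the median across basins.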

Notably, these scores were calculated across 18,588 river basins, highlighting the wide applicability of the model. AIFL’s strengths extend beyond overall accuracy: the model is also highly reliable at detecting extreme events, a critical feature for effective flood preparedness. Because it operates deterministically, it provides a consistent and predictable baseline for hydrological predictions.
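Extreme-event detection is often scored by counting exceedances of a flow threshold. The article does not say which detection metrics were used for AIFL, so the contingency-table scores below (probability of detection, false alarm ratio, critical success index) are an assumed, commonly used choice, and the threshold and flow values are made up.

```python
def event_scores(obs, sim, threshold):
    """Score threshold exceedances in simulated flow against observed flow."""
    pairs = list(zip(obs, sim))
    hits = sum(1 for o, s in pairs if o >= threshold and s >= threshold)
    misses = sum(1 for o, s in pairs if o >= threshold and s < threshold)
    false_alarms = sum(1 for o, s in pairs if o < threshold and s >= threshold)

    nan = float("nan")
    observed_events = hits + misses
    forecast_events = hits + false_alarms
    any_events = hits + misses + false_alarms
    return {
        # Fraction of observed events that were forecast.
        "pod": hits / observed_events if observed_events else nan,
        # Fraction of forecast events that did not occur.
        "far": false_alarms / forecast_events if forecast_events else nan,
        # Hits relative to all observed or forecast events.
        "csi": hits / any_events if any_events else nan,
    }


# Hypothetical daily flows and flood threshold for one basin.
observed = [1.0, 5.0, 6.0, 2.0, 7.0]
simulated = [1.0, 6.0, 2.0, 5.0, 8.0]
scores = event_scores(observed, simulated, threshold=5.0)
print(scores)
```

In practice the threshold would typically be basin-specific, for example a flow with a chosen return period, rather than a single fixed value.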

The two-step training strategy successfully bridged the gap between historical reanalysis data and operational forecasting products. Benchmarks show that AIFL is competitive with existing state-of-the-art global systems and provides a transparent and reproducible forcing pipeline that facilitates validation and adaptation. Rather than relying on complex parameterizations, AIFL’s LSTM-based approach offers a streamlined alternative for global flow prediction.

Bridging the gap between hydrological simulation and accurate flood forecasting

Scientists are inching closer to reliable global flood predictions, a goal long hampered by the difficulty of translating model performance in research settings into real-world accuracy. For many years, hydrological models have excelled at simulating past events but have struggled to predict future ones because of mismatches between the data used for training and the imperfect forecast information available in operation.

The study introduces AIFL, which directly addresses this “reanalysis-to-forecast domain shift” through a two-step training process. Rather than simply fitting historical data, the developers first established a strong foundation in hydrological processes and then adapted it to the specific biases present in current weather prediction systems.

Achieving predictive skill across the planet is a considerable task, and model performance is not uniform everywhere. Forecasting in data-poor areas remains a challenge, as accuracy naturally varies with local data quality and availability. The focus on a transparent and reproducible forcing pipeline is particularly valuable, allowing other researchers to scrutinize and improve the methodology.

The impact extends beyond improved forecasting: the work could open a path to more reliable early warning systems and help reduce the devastating effects of flooding on vulnerable communities. Unlike previous global models within this ecosystem, this model is built end-to-end and provides a streamlined approach. The method could also be applied to other environmental forecasting challenges, such as drought prediction and water resource management. Although the model provides a strong baseline, future work should consider incorporating additional data sources such as satellite imagery and ground sensors to further refine its predictive capabilities.

👉 More information
🗞 AIFL: Global daily flow prediction model using deterministic LSTM pre-trained on ERA5-Land and fine-tuned on IFS.
🧠 ArXiv: https://arxiv.org/abs/2602.16579


