Predicting disease risk decades in advance with AI

Summary

  • Researchers have developed an AI model to estimate long-term disease risk across more than 1,000 medical conditions
  • The model, trained and tested on anonymized medical data from the UK and Denmark, can predict health outcomes more than a decade in advance
  • The AI model is not ready for direct clinical use, but offers new ways to study diseases and inform medical strategies

Imagine a future in which your medical history could help predict the health conditions you might face over the next 20 years. Using large-scale health records, researchers have developed a generative AI model that estimates how human health changes over time. It can forecast the risk and timing of more than 1,000 diseases and predict health outcomes more than 10 years in advance.

This new generative AI model was custom built using algorithmic concepts similar to those used in large language models (LLMs). It was trained on anonymized patient data from 400,000 participants in the UK Biobank. The researchers also successfully tested the model using data from 1.9 million patients in the Danish National Patient Registry. This approach is one of the most comprehensive demonstrations to date of how generative AI can model human disease progression at scale, and it was tested on data from two completely separate health care systems.

“Our AI model is a proof of concept, showing that AI can learn many of our long-term health patterns and use this information to generate meaningful predictions.” “By modeling how diseases develop over time, we can begin to explore when specific risks emerge and how best to plan early interventions. This is a major step towards a more personalized and preventive approach to healthcare.”

This work, published in the journal Nature, was a collaboration between EMBL, the German Cancer Research Center (DKFZ), and the University of Copenhagen.

AI for health prediction

Just as large language models learn the structure of sentences, this AI model learns the “grammar” of health data, treating a medical history as a series of events that unfold over time. These events include medical diagnoses and lifestyle factors such as smoking. The model learns to predict disease risk from the order in which such events occur and the time that passes between them.
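To make the idea of a medical history as a sequence of timed events more concrete, here is a minimal sketch in Python. It is not the researchers' code: the `HealthEvent` type, the `to_sequence` helper, and the event labels are all hypothetical names chosen for illustration. It only shows how events might be put in order and paired with the time elapsed since the previous event.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthEvent:
    """One entry in a medical history (hypothetical structure)."""
    age_years: float  # age at which the event was recorded
    code: str         # illustrative label, not a real coding scheme


def to_sequence(events):
    """Sort events by age and attach the gap since the previous event.

    Returns a list of (code, age_years, gap_years) tuples; the first
    event has a gap of 0.0 because nothing precedes it.
    """
    ordered = sorted(events, key=lambda e: e.age_years)
    sequence = []
    previous_age = None
    for event in ordered:
        gap = 0.0 if previous_age is None else event.age_years - previous_age
        sequence.append((event.code, event.age_years, gap))
        previous_age = event.age_years
    return sequence


# Invented example history, deliberately given out of order.
history = [
    HealthEvent(58.5, "type_2_diabetes"),
    HealthEvent(45.0, "smoking"),
    HealthEvent(52.0, "hypertension"),
]

for code, age, gap in to_sequence(history):
    print(f"{code:>16}  age={age:5.1f}  gap={gap:4.1f} years")
```

A sequence model would consume ordered tuples like these, so that both the order of events and the waiting times between them are available as input.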

“Medical events often follow predictable patterns,” says Tom Fitzgerald, a staff scientist at EMBL's European Bioinformatics Institute (EMBL-EBI). “Our AI model learns these patterns and can predict future health outcomes. It gives us a way to explore what might happen based on a person's medical history and other important factors. Importantly, this is not a certainty, but an estimate of potential risk.”

The model works particularly well for conditions with clear, consistent progression patterns, such as certain types of cancer, heart attacks, and sepsis, a type of blood poisoning. However, it is less reliable for more variable conditions, such as mental health disorders and pregnancy-related complications, which depend on unpredictable life events.

Future Use and Limitations

Like a weather forecast, this new AI model offers probabilities rather than certainties. Although it cannot say exactly what will happen to an individual, it provides well-calibrated estimates of how likely certain conditions are to occur over a given period of time, for example the chance of developing heart disease within the next year. These risks are expressed as rates over time, similar to forecasting a 70% chance of rain tomorrow. In general, predictions over shorter periods are more accurate than predictions over longer ones.
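As a rough illustration of how a rate over time relates to a probability, the sketch below converts a yearly event rate into the chance of at least one event within a given horizon. It assumes the rate stays constant over that horizon, which is a simplification made here for illustration and not a description of how the model itself works. The function name `risk_within` and the example rate are hypothetical.

```python
import math


def risk_within(rate_per_year: float, years: float) -> float:
    """Probability of at least one event within `years`.

    Assumes a constant event rate (simplifying assumption):
    P = 1 - exp(-rate * years).
    """
    if rate_per_year < 0 or years < 0:
        raise ValueError("rate and horizon must be non-negative")
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    return -math.expm1(-rate_per_year * years)


# Hypothetical rate of 0.01 events per year, chosen only for illustration.
for horizon in (1, 5, 10):
    print(f"{horizon:>2} years: {risk_within(0.01, horizon):.2%}")
```

Under this assumption the probability grows with the horizon but never exceeds 100%, which is why a small yearly rate can still add up to a substantial risk over a decade.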

For example, the model predicts varying levels of heart attack risk. In the UK Biobank cohort at age 60-65, the risk of heart attack ranges from about 4 in 10,000 per year for some men to approximately 1 in 100 per year for others, depending on previous diagnoses and lifestyle. Women have a lower risk on average, but a similar spread of risk. Risk also increases, on average, as people age. A systematic assessment on UK Biobank data not used for training showed that these calculated risks correspond well to the number of cases observed across age and sex groups.
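The kind of check described above, comparing predicted risk with the number of cases actually observed in each group, can be sketched as follows. This is a generic calibration comparison under stated assumptions, not the evaluation procedure used in the study. The function name `calibration_by_group`, the group labels, and every number in the example records are invented for illustration.

```python
from collections import defaultdict


def calibration_by_group(records):
    """Compare mean predicted risk with observed event frequency per group.

    `records` is an iterable of (group, predicted_risk, had_event) tuples.
    Returns {group: (mean_predicted_risk, observed_frequency)}.
    """
    totals = defaultdict(lambda: [0.0, 0, 0])  # [sum of risks, cases, count]
    for group, predicted_risk, had_event in records:
        entry = totals[group]
        entry[0] += predicted_risk
        entry[1] += int(had_event)
        entry[2] += 1
    return {
        group: (risk_sum / count, cases / count)
        for group, (risk_sum, cases, count) in totals.items()
    }


# Invented held-out records; real evaluations use far larger groups.
held_out = [
    ("group_a", 0.50, True),
    ("group_a", 0.50, False),
    ("group_b", 0.25, False),
    ("group_b", 0.25, False),
    ("group_b", 0.25, True),
    ("group_b", 0.25, False),
]

for group, (predicted, observed) in calibration_by_group(held_out).items():
    print(f"{group}: predicted {predicted:.2f}, observed {observed:.2f}")
```

A well-calibrated model is one where the two numbers in each group are close, even though no individual outcome is predicted with certainty.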

The model has been calibrated to produce accurate population-level risk estimates, predicting how often certain conditions occur within a group of people. However, like any AI model, it has limitations. For example, childhood and adolescent health events are underrepresented, because the UK Biobank training data come primarily from individuals aged 40-60 years. The model also carries demographic biases due to gaps in the training data, including the underrepresentation of certain ethnic groups.

The model is not ready for clinical use, but it could already help researchers to:

  • Understand how diseases develop and progress over time;
  • Explore how lifestyle and past illnesses affect long-term disease risk;
  • Simulate health outcomes using artificial patient data in situations where real-world data are difficult to obtain or access.

In the future, similar AI tools trained with more representative datasets may help clinicians identify high-risk patients early. With an aging population and an increase in chronic disease, predicting future health needs could help improve health system planning and allocate resources more efficiently. However, much more testing, consultation and robust regulatory frameworks are required before AI models can be deployed in a clinical setting.

“This is the beginning of a new way to understand human health and disease progression,” said Moritz Gerstung, head of AI in oncology at DKFZ and former group leader at EMBL-EBI. “Generative models like ours could one day help personalize care and anticipate health care needs at scale. By learning from large populations, these models offer a powerful lens into how diseases unfold, and could ultimately support earlier, more tailored interventions.”

Reference: Shmatko A, Jung AW, Gaurav K, et al. Learning the natural history of human disease with generative transformers. Nature. 2025:1-9. doi:10.1038/s41586-025-09529-3
