People are exposed to thousands of chemicals every day through the products they use, the foods they eat, and the environment they live in, but only a small percentage of these chemicals have been thoroughly tested for safety.
Researchers at the Texas A&M College of Veterinary Medicine and Biomedical Sciences (VMBS) are turning to artificial intelligence to fill that gap, using new tools to predict chemical toxicity and determine how reliable those predictions are.
The work builds on a recent study published in Nature Communications that examined how artificial intelligence can predict chemical toxicity and estimate the reliability of its own predictions.
Dr. Weihsueh Chiu, a professor in VMBS’s Department of Veterinary Physiology and Pharmacology, is leading efforts to advance these tools and apply them to better understand chemical safety and risks.
“With the artificial intelligence tools we are developing, we now have a way to estimate which exposure levels are less likely to cause harm,” Chiu said. “These tools can play an important role in regulatory decision-making, helping regulators identify which substances require further testing, stricter regulation, or removal from the market.”
Long-standing problems in toxicology
Traditionally, scientists have determined whether chemicals are safe through animal experiments and human epidemiological studies, which track how chemicals affect people over time. Both are slow, expensive, and limited in scope.
“With rodents, we don’t have enough time or resources to test everything,” Chiu said. “In human studies, by the time we see the effects, people are already sick.”
This creates a large data gap between the number of chemicals on the market and those with reliable safety data, leaving many substances largely unstudied.
To address this, researchers have spent the past decade developing machine learning models, known as quantitative structure-activity relationship (QSAR) models, that use a chemical’s structure to estimate safe exposure levels.
However, while these models can generate predictions, one major limitation is transparency.
Many traditional systems operate as “black boxes,” producing answers without explaining how they arrived at them, which makes them difficult for regulators and scientists to trust.
Chiu has previously helped address this problem through a two-stage machine learning framework designed to make predictions more interpretable.
Specifically, rather than relying on abstract molecular descriptors, the model uses well-known real-world properties such as water solubility, biodegradability, and toxicity indicators to determine how these properties influence potential health effects.
This approach allows risk assessors to better understand not just what the predictions are, but why they were made.
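The idea of predicting through interpretable properties can be illustrated with a toy example. The sketch below is not the published model: it assumes a first stage has already estimated a few familiar properties for one made-up chemical, and that a second stage combines them linearly into the log10 of a safe exposure level. Every name, weight, and value is hypothetical and chosen only to show how each property's contribution to the final number can be read off directly.

```python
# Toy two-stage sketch. All names, weights, and numbers are hypothetical
# illustrations, not values from the study.

# Stage 1 output: interpretable properties that would normally be estimated
# from chemical structure. Here they are hard-coded for one made-up chemical.
properties = {
    "log_water_solubility": -2.0,  # lower means less soluble
    "biodegradability": 0.2,       # 0 (persistent) to 1 (readily degraded)
    "acute_toxicity_index": 0.7,   # 0 (low) to 1 (high)
}

# Stage 2: an assumed linear model from properties to the log10 of a
# safe exposure level.
INTERCEPT = 1.0
WEIGHTS = {
    "log_water_solubility": 0.3,
    "biodegradability": 1.5,
    "acute_toxicity_index": -2.0,
}


def explain_prediction(props):
    """Return the prediction and each property's contribution to it."""
    contributions = {name: WEIGHTS[name] * value for name, value in props.items()}
    prediction = INTERCEPT + sum(contributions.values())
    return prediction, contributions


prediction, contributions = explain_prediction(properties)

# Rank properties by how strongly they moved the prediction.
ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

print(f"predicted log10 safe exposure: {prediction:.2f}")
for name, value in ranked:
    print(f"  {name}: {value:+.2f}")
```

Because the second stage here is a simple weighted sum, a risk assessor can see which property drove the estimate. A real model would likely be more complex, but the goal of tracing a prediction back to recognizable properties is the same.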
Key innovation: knowing what you don’t know
More recently, Chiu and his colleagues expanded this work to include so-called “uncertainty-aware” machine learning, an approach that estimates the confidence of each prediction.
“We want these machine learning models to not only predict numbers, but also to indicate how confident they are in their predictions,” Chiu said. “Their reliability depends on how much existing data the model has to draw on.
“Predictions are more reliable when similar chemicals have been studied, but become more uncertain when data are limited,” he said. “This will help researchers identify which chemicals require close attention.”
For example, two chemicals may appear equally toxic on paper, but the prediction for one may be far less certain, meaning its potential worst-case risk may be much higher.
“Just because two chemicals have the same predictions doesn’t mean they have the same worst-case risks,” Chiu said.
To capture this, the model generates a range of possible outcomes for each prediction and indicates how certain or uncertain that prediction is.
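A small simulation shows why two chemicals with the same central prediction can carry different worst-case risks. This is a sketch under stated assumptions, not the study's method: the function `simulate_outcomes` is a hypothetical stand-in for a model that returns many plausible outcomes per chemical, the values are log10 of a made-up safe exposure level, and the spreads are invented.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible


def simulate_outcomes(center, spread, n=5000):
    """Hypothetical stand-in for a model that returns many plausible outcomes.

    Values are log10 of an illustrative safe exposure level.
    """
    return [random.gauss(center, spread) for _ in range(n)]


def summarize(samples):
    """Median prediction plus the 5th and 95th percentiles."""
    cuts = statistics.quantiles(samples, n=20)  # 19 cut points at 5% steps
    return {
        "median": statistics.median(samples),
        "p05": cuts[0],
        "p95": cuts[-1],
    }


# Same central prediction, very different amounts of supporting data (assumed).
well_studied = summarize(simulate_outcomes(center=0.0, spread=0.3))
data_poor = summarize(simulate_outcomes(center=0.0, spread=1.5))

for label, s in [("well studied", well_studied), ("data poor", data_poor)]:
    print(f"{label}: median={s['median']:.2f}, 5th-95th=({s['p05']:.2f}, {s['p95']:.2f})")
```

Both chemicals have nearly the same median, but the lower end of the interval for the data-poor chemical sits far below that of the well-studied one. In this illustration, that means harm cannot be ruled out at much lower exposures, which is the sense in which equal predictions do not imply equal worst-case risk.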
Applying these models to more than 126,000 chemicals revealed important patterns not only in toxicity but also in uncertainty.
Certain chemical groups, such as metals, polychlorinated compounds, and PFAS, exhibited higher levels of uncertainty, often due to limited data or complex chemical behavior that is difficult to model.
“These insights will help us identify where further research is needed and where to focus our efforts,” Chiu said.
This approach allows scientists to systematically identify where the largest knowledge gaps exist across the chemical landscape, rather than chasing the latest chemicals of concern.
From prediction to decision making
For researchers, using machine learning to identify safe and unsafe substances is only part of the solution. Through uncertainty estimation, researchers can also determine when human expertise is still needed.
Chiu described this as a phased approach: use AI for large-scale screening, and reserve expert review for high-risk or high-uncertainty cases.
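The phased approach can be pictured as a simple routing rule. The sketch below assumes each chemical comes with a predicted log10 safe exposure level and the width of its uncertainty interval. The `Prediction` class, the two cutoffs, and the example chemicals are all hypothetical, since the article gives no specific thresholds.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    name: str
    median_log_dose: float  # predicted log10 safe exposure (illustrative units)
    interval_width: float   # width of the uncertainty interval, in log10 units


# Hypothetical cutoffs, chosen only for illustration.
HIGH_RISK_BELOW = -1.0        # a low safe exposure level suggests high potency
HIGH_UNCERTAINTY_ABOVE = 2.0  # a wide interval suggests sparse or difficult data


def triage(p):
    """Send high-risk or high-uncertainty chemicals to human experts."""
    if p.median_log_dose < HIGH_RISK_BELOW or p.interval_width > HIGH_UNCERTAINTY_ABOVE:
        return "expert review"
    return "automated screening only"


batch = [
    Prediction("chemical_A", 0.5, 0.8),   # low risk, confident
    Prediction("chemical_B", -2.0, 0.6),  # high risk, confident
    Prediction("chemical_C", 0.5, 3.5),   # low predicted risk, very uncertain
]

routes = {p.name: triage(p) for p in batch}
for name, route in routes.items():
    print(f"{name}: {route}")
```

Note that chemical_C is flagged even though its central prediction matches chemical_A: the uncertainty alone is enough to call for expert judgment.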
Although challenges remain, including limited data and reliance on previously conducted animal-based studies, the integration of AI represents an important step forward.
As these tools continue to evolve, they have the potential to fundamentally change the way scientists and regulators approach chemical safety, moving from reactive testing to proactive prediction.
Written by Kamryn Haynes, Texas A&M University College of Veterinary Medicine and Biomedical Sciences
