By combining oxidative stress biology and advanced machine learning, researchers show how a simple blood-based antioxidant measurement can significantly improve prediabetes risk prediction and support earlier, more targeted prevention strategies.

Study: Artificial intelligence models as tools to predict prediabetes.
In a recent study published in the journal Scientific Reports, researchers developed a pattern neural network (PNN) model that combines a new measure of total antioxidant status with traditional metrics to improve the prediction of prediabetes among adults in India. The PNN outperformed a support vector machine, a k-nearest neighbor model, and a logistic regression model trained on the same dataset, achieving 98.3% accuracy. According to model-derived feature importance, waist circumference and antioxidant status showed the strongest predictive power, and BMI also contributed significantly to classification performance.
Growing need to accurately detect prediabetes
Prediabetes is an important early stage characterized by elevated blood sugar levels that have not yet reached the threshold for diabetes. Each year, about 5 to 10 percent of people with prediabetes develop diabetes, and a similar percentage return to normal blood sugar levels. Because progression is not inevitable, early detection is essential to prevent type 2 diabetes and its associated long-term complications.
Traditional diagnostic approaches rely on blood-based tests and clinical assessments, but these methods can be time-consuming and expensive, and in some cases offer limited ability to predict an individual's risk. As data-driven tools advance, AI allows researchers to combine data from multiple sources and is emerging as a promising alternative for early disease detection.
AI-based predictive models offer multiple benefits, including higher diagnostic accuracy, personalized risk profiles, and early intervention. These advances have the potential to prevent disease progression and significantly reduce healthcare costs.
Integrating oxidative stress markers into AI models
Researchers used real-world clinical data from Indian adults to develop an AI model specifically optimized for predicting prediabetes. Unlike previous studies, the researchers aimed not only to identify the most accurate model, but also to build one that incorporates clinically relevant biomarkers, such as oxidative stress indicators, that may reflect the underlying pathophysiology.
This pilot study enrolled 199 adults aged 18 to 60 years, classified as prediabetic (n = 100) or healthy controls (n = 99) based on glycated hemoglobin (HbA1c) levels. After an overnight fast, 6 mL of peripheral blood was collected. Biochemical testing included measurement of HbA1c, fasting blood glucose, and lipid profile using standardized enzymatic assays. High-density lipoprotein (HDL), low-density lipoprotein (LDL), and very low-density lipoprotein (VLDL) values were calculated.
An important addition to this dataset was the measurement of total antioxidant status, which expresses antioxidant activity as a percentage of scavenging capacity. Healthy individuals typically show values between 20% and 60%.
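The article does not name the assay behind this percentage. As an illustration only, many radical-scavenging assays (DPPH-type assays, for example) compute the percentage from the absorbance of a control and of the sample. The symbols below are assumptions for this sketch, not taken from the study:

```latex
\text{Scavenging capacity (\%)} = \frac{A_{\text{control}} - A_{\text{sample}}}{A_{\text{control}}} \times 100
```

With hypothetical readings of 0.80 for the control and 0.48 for the sample, this convention would give (0.80 - 0.48) / 0.80 x 100 = 40%.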
A total of 14 features, including demographic, clinical, biochemical, and oxidative stress markers, were used to train a pattern neural network with 14 input nodes, 10 hidden nodes, and 1 output node. The data were randomly split into training, validation, and test sets, with preprocessing steps that included normalization, outlier removal, and missing-value handling. The researchers compared the model's performance with that of other AI models and logistic regression. Pearson correlation and descriptive statistics were used to explore relationships between variables and assess feature associations before training the model.
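To make the reported 14-10-1 layout concrete, here is a minimal, untrained sketch in plain Python. Only the layer sizes and the sample size of 199 come from the article. The tanh and sigmoid activations, the weight initialization, the min-max scaling, and the 70/15/15 split ratio are all assumptions for illustration, and the input values are synthetic rather than study data.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 14, 10, 1  # layer sizes reported in the article


def min_max_scale(column):
    """Rescale one feature column to [0, 1]; the scaling method is an assumption."""
    lo, hi = min(column), max(column)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in column]


def init_layer(n_in, n_out, rng):
    """Return (weights, biases); the initialization scheme is an assumption."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases, activation):
    """Apply one fully connected layer followed by an activation function."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Map 14 scaled features to a single score in (0, 1)."""
    (w1, b1), (w2, b2) = params
    hidden = dense(x, w1, b1, math.tanh)      # tanh hidden units: an assumption
    return dense(hidden, w2, b2, sigmoid)[0]  # sigmoid output: an assumption


def count_parameters(params):
    """Count every weight and bias in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


def random_split(n_samples, rng, train_frac=0.70, val_frac=0.15):
    """Shuffle sample indices and split them; the ratios are assumptions."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


rng = random.Random(0)
params = [init_layer(N_INPUT, N_HIDDEN, rng), init_layer(N_HIDDEN, N_OUTPUT, rng)]
train_idx, val_idx, test_idx = random_split(199, rng)

scaled_waist = min_max_scale([78.0, 85.5, 92.0, 101.0])      # synthetic values
x_example = [rng.uniform(0.0, 1.0) for _ in range(N_INPUT)]  # synthetic input
score = forward(x_example, params)

print(count_parameters(params))  # 14*10 + 10 + 10*1 + 1 = 161
print(len(train_idx), len(val_idx), len(test_idx))
```

A network this small has only 161 trainable parameters, which is one reason a compact architecture can be fitted to a cohort of roughly 200 people, though it does not remove the need for external validation.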
Key biomarkers that differentiate prediabetic profiles
Of the 14 variables measured, six showed significant differences between those with and without prediabetes: age, BMI, waist circumference, antioxidant activity, oral glucose tolerance test (OGTT), and HbA1c level. People with prediabetes showed significantly lower antioxidant capacity, indicating higher oxidative stress, along with higher levels of important metabolic indicators such as HbA1c and glucose response.
Boxplot analysis revealed distinct distributions of HbA1c, OGTT, and lipid markers, reinforcing these group differences. Fasting blood glucose also showed some distributional differences, although the group comparison did not reach statistical significance. Some parameters showed positively skewed distributions, suggesting clustering of outliers in the prediabetic group. Correlation tests highlighted moderate associations between BMI and waist circumference, and between anthropometric measurements and fasting blood glucose, suggesting that these measures capture overlapping but not identical aspects of metabolic risk.
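For readers unfamiliar with the statistic, the Pearson correlation used in this kind of analysis can be sketched in a few lines of plain Python. The BMI and waist values below are synthetic and purely illustrative. They are not drawn from the study, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Synthetic example values (kg/m^2 and cm), for illustration only.
bmi = [22.1, 24.8, 27.3, 29.0, 31.5, 26.2]
waist_cm = [80.0, 84.0, 95.0, 91.0, 104.0, 88.0]

r = pearson_r(bmi, waist_cm)
print(round(r, 3))
```

The coefficient ranges from -1 to 1. Values near 0 indicate little linear association, while a moderate value of the kind the article describes means two measures move together without being interchangeable.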
PNN model demonstrates excellent predictive accuracy
The PNN model trained on these variables classified participants with high accuracy. It achieved an accuracy of 97.9% on the training set and 95.2% on both the test and validation sets. The overall accuracy across all datasets was reported as 98.3%, with perfect precision, strong recall, and a strong F1 score. The PNN consistently outperformed the alternative models, achieving the highest area under the curve (AUC) and the lowest error.
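The metrics named here all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions in plain Python. The counts are hypothetical and chosen only to show how precision can be perfect while recall is not. They are not the study's confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the model detects.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: no false positives, five missed cases.
metrics = classification_metrics(tp=45, fp=0, fn=5, tn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical case, precision is 1.0 because no healthy participant is flagged, while recall is 0.9 because five prediabetic participants are missed. In a screening setting, recall is often the more consequential figure, since missed cases are the people who lose the chance for early intervention.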
What this means for early risk stratification
This study successfully integrated total antioxidant status into an AI-based prediabetes prediction model for an Indian population, highlighting that oxidative stress is an important yet often overlooked risk marker with potential mechanistic relevance to disease development rather than just a correlational feature.
The findings indicate that waist circumference, BMI, glucose markers, and antioxidant capacity are among the most informative predictors, which is consistent with evidence from other populations. The PNN achieved higher accuracy than traditional machine learning models and demonstrated strong potential as a rapid and low-cost screening tool, pending external validation in an independent cohort.
Strengths include a comprehensive set of biochemical and clinical features and the introduction of oxidative stress measurements that add biological depth to risk assessment. However, the single-center design, modest sample size, and cross-sectional nature limit generalizability and the ability to track change over time.
Overall, PNN provides a robust framework for early detection and risk stratification of prediabetes. Future studies should validate the model in larger multicenter cohorts and formally assess real-world feasibility and stability of performance, while exploring integration with longitudinal clinical data for future clinical and public health applications.
Journal reference:
- Yesupatham, A., Das, R., Bharani, G., Shaikmeeran, M., Saraswathy, R. (2025). Artificial intelligence models as tools to predict prediabetes. Scientific Reports 15: 43421. DOI: 10.1038/s41598-025-23227-0, https://www.nature.com/articles/s41598-025-23227-0
