A new article published in the Journal of Dental Research explores the development of an integrated data cleaning and subtype detection pipeline that uses unsupervised machine learning to comprehensively analyze and visualize data patterns in the National Health and Nutrition Examination Survey (NHANES) database.
The article, "Using Machine Learning to Uncover Caries Heterogeneity in NHANES," by Alena Orlenko of Cedars-Sinai Medical Center in Los Angeles, California, USA, addresses the limitations of NHANES, one of the most nationally representative repositories of population-level health-related indicators. It does so by establishing a data cleaning pipeline that uses a novel outlier detection algorithm, together with unsupervised machine learning, to identify phenotypic subtypes in NHANES caries data.
“By applying the power of machine learning to a large national dataset, the authors identified a key set of factors associated with tooth decay in children and older adults,” said Nick Jakubovics, editor-in-chief of the Journal of Dental Research. “The next challenge is to use this information to find more effective ways to prevent tooth decay in different groups of people.”
The study demonstrates a robust data cleaning and subtype detection pipeline that can be applied to the investigation of other health conditions using NHANES and similar databases for machine learning predictive modeling. Applying a comprehensive bioinformatics pipeline to NHANES data successfully identified substantial age-related heterogeneity in dental caries, suggesting that stratification by age is important for future predictive modeling.
This integrative approach systematically addresses data quality issues and facilitates exploratory analyses to uncover data patterns associated with subtypes, as well as variables associated with the clinical heterogeneity of caries. The study revealed new associations between dental caries status and lead/contaminant exposure, specific test markers, food types, and sleep patterns, reflecting additional disease markers in a susceptible population. This demonstrates the value of integrating data science techniques with large-scale observational data to gain deeper insights into complex multifactorial diseases.
Source:
International Association for Dental, Oral, and Craniofacial Research
Journal reference:
Orlenko, A., et al. (2025) Using machine learning to reveal caries heterogeneity in NHANES. Journal of Dental Research. doi:10.1177/00220345251398027. https://journals.sagepub.com/doi/10.1177/00220345251398027
