A new study demonstrates a two-stage machine learning strategy that overcomes bias in spICP-TOF-MS data and improves nanoparticle classification. The approach discriminates engineered, incidental, and natural nanoparticle types with high accuracy, providing a robust and efficient method for classifying nanoparticles in complex samples.
Single-particle inductively coupled plasma time-of-flight mass spectrometry (spICP-TOF-MS) holds promise for quantifying and classifying nanoparticles (NPs) based on their elemental composition. However, systematic biases in spICP-TOF-MS data make accurate NP classification difficult. To address this issue, researchers at Iowa State University developed a multi-stage semi-supervised machine learning (SSML) strategy that overcomes this inherent bias and improves NP-type classification. This work was published in Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy (1).
SSML trains a machine learning model in multiple steps using a combination of labeled and unlabeled data. The first stage uses labeled data to train an initial model, which is then used to identify and isolate “noise classes,” or groups of misclassified instances, in the data. In the second stage, these noise classes are incorporated into the training process to enrich the labeled data and improve model performance. By iteratively refining the model with previously misclassified instances, SSML aims to improve its ability to classify new, unknown data accurately. Because it draws on both labeled and unlabeled data, the approach is especially helpful when labeled data is limited or expensive to obtain.
The research team’s approach involves identifying “noise classes” that explain systematic particle misclassification and incorporating them into the SSML model. This yields a more robust classification model that identifies NP types accurately. The researchers conducted a case study using cerium(IV) oxide, ferrocerium mischmetal, and bastnaesite mineral NPs as representatives of engineered nanoparticles (ENPs), incidental nanoparticles (INPs), and natural nanoparticles (NNPs), respectively.
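The two-stage idea described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the nearest-centroid classifier is a simple stand-in, the function names are hypothetical, and the two-element "signals" are invented toy values in which one small NNP lacks its second element and therefore looks like an ENP.

```python
from collections import defaultdict
from math import dist


def fit_centroids(X, y):
    """Return {label: centroid} for a nearest-centroid classifier."""
    groups = defaultdict(list)
    for features, label in zip(X, y):
        groups[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in groups.items()
    }


def predict(centroids, X):
    """Assign each sample to the label of its closest centroid."""
    return [min(centroids, key=lambda lab: dist(centroids[lab], x)) for x in X]


def two_stage_fit(X, y):
    """Stage 1: fit on the original labels and find misclassified samples.
    Stage 2: relabel those samples as '<label>_noise' and refit."""
    stage1 = fit_centroids(X, y)
    stage1_pred = predict(stage1, X)
    y_stage2 = [
        label if label == guess else f"{label}_noise"
        for label, guess in zip(y, stage1_pred)
    ]
    return fit_centroids(X, y_stage2), y_stage2


# Invented toy data: (signal of element 1, signal of element 2) per particle.
X = [(10, 0), (12, 0), (11, 0), (9, 0),       # ENP-like: element 1 only
     (20, 10), (22, 11), (18, 9), (8, 0)]     # NNP-like; last one lost element 2
y = ["ENP"] * 4 + ["NNP"] * 4

model, y_stage2 = two_stage_fit(X, y)
print(sorted(model))                 # class names known after stage 2
print(predict(model, [(21, 10.5)]))  # classify a new two-element particle
```

In this toy case, the single-element NNP is pulled out into its own noise class instead of distorting the NNP centroid, which is the intuition behind giving systematically misclassified particles a class of their own.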
The final SSML model achieved an area under the receiver operating characteristic curve (ROC AUC) of 0.979. The false positive rates for ENPs, INPs, and NNPs were very low, at 0.030, 0.001, and 0, respectively. This level of accuracy enables reliable particle-type classification even in mixed samples with different concentrations. The researchers demonstrated that the two-stage SSML model can quantify particle types across a wide concentration range of over two orders of magnitude.
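For readers unfamiliar with these metrics, the following Python sketch shows one common way to compute a one-vs-rest false positive rate and a binary ROC AUC. The function names and all labels and scores are invented for illustration. The article does not state how the study combined per-class results into a single ROC AUC, so only the binary case is shown.

```python
def false_positive_rate(y_true, y_pred, positive):
    """One-vs-rest FPR: share of non-`positive` samples predicted as `positive`."""
    negatives = [p for t, p in zip(y_true, y_pred) if t != positive]
    if not negatives:
        raise ValueError("no negative samples for this class")
    return sum(p == positive for p in negatives) / len(negatives)


def roc_auc(labels, scores):
    """Binary ROC AUC: probability that a random positive sample scores
    higher than a random negative sample (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented labels: one INP particle is wrongly predicted as ENP.
y_true = ["ENP", "ENP", "INP", "INP", "NNP", "NNP", "NNP", "NNP"]
y_pred = ["ENP", "ENP", "INP", "ENP", "NNP", "NNP", "NNP", "NNP"]

for cls in ("ENP", "INP", "NNP"):
    print(cls, false_positive_rate(y_true, y_pred, cls))

# Invented binary example: 1 = target class, scores are model confidences.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

A false positive rate of 0 for a class, as reported for NNPs, means that no particle of another type was assigned to that class in the evaluated data.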
The significance of this work lies in its ability to address the biases present in spICP-TOF-MS training data, providing a straightforward and robust way to incorporate machine learning models into NP classification strategies. By overcoming these limitations of spICP-TOF-MS, the methodology advances the field of nanoparticle analysis. It has the potential to facilitate accurate identification and characterization of NPs in a wide range of applications, such as environmental monitoring, nanotoxicology, and nanomaterials development.
With further refinement and validation, the SSML approach developed by the Iowa State University team is expected to improve the accuracy and efficiency of nanoparticle classification, ultimately contributing significantly to advances in nanoscience and technology.
Reference
(1) Backman, R. L.; Gundlach-Graham, A. Machine learning analysis for classifying nanoparticles from noisy spICP-TOFMS data. Spectrochimica Acta Part A: Mol. Biomol. Spectrosc. 2023, ASAP. DOI: 10.1039/D3JA00081H