
Scan for life signs, also known as biosignatures, using the NASA Tricorder – Astrobiology.com and Grok
A machine learning framework can distinguish between molecules made by biological processes and those formed by non-biological processes, and could be used to analyze samples returned by current and future planetary exploration missions.
José C. Aponte, Amirali Aghazadeh and colleagues used two-dimensional gas chromatography combined with high-resolution time-of-flight mass spectrometry to analyze eight carbonaceous meteorites and ten geological samples. The authors used these data to develop LifeTracer, a computational framework that processes mass spectrometry data and applies machine learning to identify patterns that distinguish abiotic from biological sources.

Visualization of the distribution of compounds in meteorite and geological samples, and of the regression coefficients of a logistic regression model trained using LifeTracer. – NASA/PNAS Nexus
A logistic regression model trained on compound-level features achieved more than 87% accuracy in classifying samples as meteoritic or terrestrial. The analysis identified 9,475 peaks in the meteorite samples and 9,070 peaks in the terrestrial samples, with statistically significant differences between the two sample types in molecular weight distribution and retention time (the time a compound takes to pass through the chromatograph's two columns). Organic compounds in the meteorite samples showed significantly shorter retention times, consistent with the higher volatility of abiotically formed materials.
The framework identified polycyclic aromatic hydrocarbons and their alkylated variants as key predictive features, with naphthalene emerging as the most predictive compound for abiotic samples. According to the authors, this approach enables scalable and unbiased biosignature detection and could become a powerful tool for interpreting complex organic mixtures returned by current and future planetary sample return missions.
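To make the idea of "predictive features" concrete, here is a minimal sketch of a logistic regression trained on presence/absence features and then ranked by coefficient size. It is an illustration only, not the authors' code: the feature names, the six toy samples, and the hyperparameters (`lr`, `epochs`, `l2`) are all hypothetical, and plain gradient descent stands in for whatever solver LifeTracer actually uses.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logistic(X, y, lr=0.5, epochs=5000, l2=0.001):
    """Fit weights and bias by batch gradient descent with L2 regularization."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(abiotic) minus the label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    # Class 1 when the modeled probability is at least 0.5.
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Hypothetical toy data: rows are samples, columns mark a compound as
# present (1) or absent (0). These are NOT values from the study.
feature_names = ["naphthalene", "methylnaphthalene", "fatty_acid_ester", "sterol"]
X = [
    [1, 1, 0, 0],  # meteorite-like samples
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],  # terrestrial-like samples
    [0, 1, 1, 1],
    [0, 0, 1, 0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = abiotic (meteorite), 0 = biotic (terrestrial)

w, b = train_logistic(X, y)

# Rank features by absolute coefficient: large positive weights point toward
# the abiotic class, large negative weights toward the biotic class.
ranked = sorted(zip(feature_names, w), key=lambda p: abs(p[1]), reverse=True)
predictions = [predict(w, b, x) for x in X]

for name, coef in ranked:
    print(f"{name:>20s}  {coef:+.3f}")
```

In this toy set, naphthalene is the only compound present in every meteorite-like sample and absent from every terrestrial-like one, so it receives the largest positive coefficient. Real data would call for held-out evaluation rather than the training-set predictions shown here.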

LifeTracer workflow for collecting, curating, and analyzing mass spectrometry data and developing machine learning models to classify samples. A) Soluble non-polar and semi-polar organics in 8 meteorites and 10 geological samples were analyzed using untargeted two-dimensional gas chromatography combined with high-resolution time-of-flight mass spectrometry, yielding four-dimensional total ion images (TII) whose dimensions are mass-to-charge ratio (m/z), retention time on the first column (RT1), retention time on the second column (RT2), and intensity (abundance). The figure shows the workflow for meteorite 1 (Aguas Zarcas) and terrestrial sample 1 (Icelandic soil), with distinct peaks at 162 amu and 102 amu, respectively. The instrument is drawn as a cartoon schematic to illustrate its layout. B) High-intensity peaks are extracted from each TII. Several peaks may represent fragment ions derived from the same parent compound. C) Peaks are clustered and tabulated, with rows representing features and columns representing samples. Black and gray squares indicate the presence and absence of a feature, respectively. In the figure, the squares marked A and B correspond to the peaks at m/z = 162 and 102 amu for the Aguas Zarcas and Icelandic soil samples. D) A logistic regression model is trained on the processed data to classify samples into abiotic and biotic classes based on the composition of their organic compounds. Features with large regression coefficients are then examined to identify the organic compounds that matter most in separating biological from abiotic samples. The fragmentation patterns and exact masses were analyzed manually and compared with standards to determine the identity, or the candidate molecule type, of each compound flagged by LifeTracer. – NASA/PNAS Nexus
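Steps B and C of the workflow (extract high-intensity peaks, cluster them, tabulate presence or absence per sample) can be sketched as follows. This is a rough illustration under stated assumptions, not the LifeTracer implementation: the greedy tolerance-based clustering, the `Peak` type, the `build_presence_table` helper, the tolerance and intensity thresholds, and every peak value below are hypothetical.

```python
from typing import NamedTuple

class Peak(NamedTuple):
    sample: str       # sample name
    mz: float         # mass-to-charge ratio
    rt1: float        # retention time on the first column
    rt2: float        # retention time on the second column
    intensity: float  # abundance

def build_presence_table(peaks, samples, mz_tol=0.01, rt1_tol=5.0,
                         rt2_tol=0.1, min_intensity=1000.0):
    """Cluster peaks into features and tabulate presence (1) / absence (0).

    Greedy clustering (an assumption): peaks are visited from most to least
    intense; a peak joins the first feature whose reference peak lies within
    all three tolerances, otherwise it starts a new feature.
    """
    features = []  # list of (reference peak, set of sample names)
    strong = [p for p in peaks if p.intensity >= min_intensity]
    for p in sorted(strong, key=lambda q: -q.intensity):
        for ref, members in features:
            if (abs(p.mz - ref.mz) <= mz_tol
                    and abs(p.rt1 - ref.rt1) <= rt1_tol
                    and abs(p.rt2 - ref.rt2) <= rt2_tol):
                members.add(p.sample)
                break
        else:
            features.append((p, {p.sample}))
    # Rows are features, columns are samples, as in panel C.
    table = [[1 if s in members else 0 for s in samples]
             for _, members in features]
    return [ref for ref, _ in features], table

# Hypothetical peaks; none of these numbers come from the study.
peaks = [
    Peak("meteorite_1", 162.10, 900.0, 1.50, 5.0e4),
    Peak("meteorite_2", 162.10, 902.0, 1.52, 3.0e4),
    Peak("soil_1", 102.05, 400.0, 0.80, 4.0e4),
    Peak("soil_1", 162.10, 1500.0, 2.40, 2.0e4),     # same m/z, different retention
    Peak("meteorite_1", 102.05, 401.0, 0.81, 500.0),  # below the intensity cutoff
]
samples = ["meteorite_1", "meteorite_2", "soil_1"]

refs, table = build_presence_table(peaks, samples)
for ref, row in zip(refs, table):
    print(f"m/z {ref.mz:.2f}, RT1 {ref.rt1:.0f}, RT2 {ref.rt2:.2f}: {row}")
```

Here the two meteorite peaks near m/z 162 merge into one feature, the late-eluting soil peak at the same m/z stays separate because its retention times differ, and the weak peak is discarded. The resulting rows are the kind of binary feature vectors a classifier in step D would consume.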
Identification of abiotic and biological organic matter in meteorites and terrestrial samples using machine learning of mass spectrometry data, PNAS NEXUS (Open Access)
