Exploring molecular assemblies as biosignatures using mass spectrometry and machine learning

Machine Learning


Growth rate of molecular complexity per bond for PubChem and COCONUT 2.0 molecules: (A) Bertz, (B) Böttcher, (C) MA. (D) The two databases had similar distributions of bond counts per molecule. – cs.LG

Molecular assembly offers a promising path to detecting life beyond Earth while minimizing assumptions based on terrestrial life.

Because mass spectrometers will be central to future Solar System missions, predicting molecular assembly from their data, without needing to elucidate unknown structures, will be essential for unbiased life detection. An ideal agnostic biosignature must be interpretable and experimentally measurable.

Here we show that molecular assembly, a recently developed approach to measuring objects produced by evolution, meets both criteria. First, it is interpretable for life detection because it reflects how molecules are assembled with their bonds as building blocks, in contrast to approaches that discount construction history.

Second, it can be determined without structural elucidation because it can be physically measured by mass spectrometry, a property that distinguishes it from approaches that rely on structure-based information measures of molecular complexity. Although molecular assembly can be measured directly from mass spectrometry data, mission constraints impose limits. To address this, we developed a machine learning model that predicts molecular assembly with high accuracy, reducing error threefold compared with the baseline model.
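As a rough illustration of what an error reduction relative to a baseline means, the sketch below compares the mean absolute error of a trivial mean predictor with that of a one-feature least-squares fit. Everything in it is an assumption made for illustration: the abstract does not say which baseline, features, or error metric the authors used, the peak-count feature is hypothetical, and the numbers are synthetic toy values, not data from the paper.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic toy values (NOT from the paper): a hypothetical per-spectrum
# feature (number of MS peaks) and a made-up MA score for each molecule.
train_peaks = [3, 5, 8, 10, 14, 18]
train_ma = [4.0, 6.0, 9.0, 10.0, 15.0, 17.0]
test_peaks = [4, 9, 16]
test_ma = [5.0, 9.5, 16.0]

# Assumed baseline: always predict the mean MA of the training set.
baseline_pred = [mean(train_ma)] * len(test_ma)

# Assumed "model": a straight-line fit on the single hypothetical feature.
slope, intercept = fit_line(train_peaks, train_ma)
model_pred = [slope * p + intercept for p in test_peaks]

baseline_mae = mae(test_ma, baseline_pred)
model_mae = mae(test_ma, model_pred)
error_reduction = baseline_mae / model_mae  # values above 1 favour the model

print(f"baseline MAE={baseline_mae:.2f}, model MAE={model_mae:.2f}, "
      f"reduction={error_reduction:.1f}x")
```

The ratio `baseline_mae / model_mae` is one simple way to express a statement such as "reducing error threefold"; the paper may define it differently.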

Simulated data show that even small instrumental inconsistencies can double model error, highlighting the need for standardization. These results suggest that standardized mass spectrometry databases could enable accurate molecular assembly prediction without structural elucidation, providing a proof of concept for future astrobiology missions.
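One way to picture why small instrument inconsistencies matter is to look at how a slight mass-calibration offset changes a binned spectrum, a common input representation for ML models. The sketch below is only an assumption-laden illustration: the abstract does not describe the authors' simulation procedure, and the function names, the 0.01 bin width, the ppm offsets, and the peak list are all hypothetical.

```python
def shift_ppm(mz_values, ppm):
    """Apply a uniform calibration offset, in parts per million, to m/z values."""
    return [mz * (1 + ppm * 1e-6) for mz in mz_values]


def bin_index(mz, bin_width=0.01):
    """Index of the fixed-width m/z bin that a (positive) peak falls into."""
    return int(mz / bin_width)


def changed_bins(mz_values, ppm, bin_width=0.01):
    """Count peaks that land in a different bin after the ppm offset."""
    shifted = shift_ppm(mz_values, ppm)
    return sum(
        bin_index(a, bin_width) != bin_index(b, bin_width)
        for a, b in zip(mz_values, shifted)
    )


# Hypothetical peak list (m/z), chosen only for illustration.
peaks = [100.004, 250.006, 499.998]

for ppm in (0, 10, 50):
    print(ppm, "ppm offset ->", changed_bins(peaks, ppm), "peaks change bin")
```

Because a ppm offset scales with m/z, higher-mass peaks cross bin boundaries first. A model trained on spectra from one instrument would then see shifted features from another, which is the kind of mismatch that standardization is meant to prevent.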

An overview of the deployment of MA as an agnostic biosignature. Future life detection missions in the Solar System could collect samples in situ and predict MA scores directly from MS data without the need for structural elucidation. Space missions using MSn can estimate MA scores with the RecursiveMA algorithm. The current paper shows that space missions using MS1 (and optionally MSn) can infer MA scores with ML. – cs.LG

Lindsay A. Rutter, Abhishek Sharma, Ian Set, David Obeh Alobo, An Goto, Leroy Cronin

Comments: 35 pages, 7 figures, 62 references
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2507.19057 [cs.LG] (or arXiv:2507.19057v1 [cs.LG] for this version)
https://doi.org/10.48550/arxiv.2507.19057
Submission history
From: Leroy Cronin Prof
[v1] Fri, 25 Jul 2025 08:19:15 UTC (1,513 KB)
https://arxiv.org/abs/2507.19057

Astrobiology, nanotechnology


