Better algorithms to predict cell behavior

Machine Learning


In a preprint published on arXiv, Altos Labs researchers describe a machine learning algorithm that predicts, end to end, how a cell’s gene expression will respond to interventions.

The need for prediction

Simulating biological processes on a computer is very difficult. Although advanced algorithms such as Google’s AlphaFold have revolutionized protein structure prediction, the complete biochemistry of a cell is orders of magnitude more complex.

One way around this is to simply use live cells. Modern RNA-seq technologies make it relatively easy to test the effects of genetic perturbations and small-molecule interventions. Even with this technique, however, the space of possibilities remains vast: different cell types respond differently, and multiple perturbations at once are often required to change cell behavior [1].

Machine learning algorithms that predict the effects of perturbations in silico are therefore valuable to the field: their predictions can be tested in vitro before such studies move on to animals and humans. Interestingly, previous research has found that simpler algorithms are more useful across a wide range of applications, and that removing extraneous constraints improves the ability of these models to generalize [2].

A flow algorithm with unusual design choices

To achieve this goal, the researchers created PRiMeFlow, an algorithm that works directly in gene expression space rather than compressing information into a lower-dimensional space, as previous algorithms do [3]. As a flow algorithm, it uses learned probabilities to transform known information into previously unknown configurations.
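To make the idea of a flow algorithm more concrete, here is a minimal sketch of flow matching on toy data. It is not PRiMeFlow: the five "genes", the `shift` vector, and the linear velocity model (a stand-in for a neural network) are all assumptions made for illustration. The sketch learns a velocity field that moves control cells toward perturbed cells directly in expression space, then integrates that field to generate predictions for new cells.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_genes = 4000, 5

# Toy data (hypothetical): unpaired control and perturbed cells.
shift = np.array([2.0, -1.0, 0.0, 0.5, 1.5])  # assumed perturbation effect
control = rng.normal(0.0, 1.0, size=(n_cells, n_genes))
perturbed = shift + rng.normal(0.0, 1.0, size=(n_cells, n_genes))

def features(x, t):
    """Inputs to the velocity model: expression, time, and a bias term."""
    return np.hstack([x, t, np.ones_like(t)])

# Flow matching training set: random points on straight lines between
# randomly paired control (t=0) and perturbed (t=1) cells.
t = rng.uniform(0.0, 1.0, size=(n_cells, 1))
x_t = (1.0 - t) * control + t * perturbed
target_velocity = perturbed - control

# Fit a linear velocity model by least squares. This minimizes the flow
# matching loss over a very simple model class (an assumption of this sketch).
weights, *_ = np.linalg.lstsq(features(x_t, t), target_velocity, rcond=None)

def generate(x0, n_steps=50):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for step in range(n_steps):
        t_now = np.full((x.shape[0], 1), step * dt)
        x = x + dt * (features(x, t_now) @ weights)
    return x

new_controls = rng.normal(0.0, 1.0, size=(1000, n_genes))
predicted = generate(new_controls)
mean_shift = predicted.mean(axis=0) - new_controls.mean(axis=0)
print("predicted mean shift per gene:", np.round(mean_shift, 2))
```

In this toy setting, the predicted mean shift should land close to the assumed `shift` vector. A real model would replace the linear fit with a neural network and condition the velocity on the perturbation and cell type.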

The authors point out that their chosen architecture, U-Net, would typically be considered suboptimal for the task at hand. The order of genes in an expression vector is arbitrary, whereas U-Net is intended for spatially oriented tasks that involve measuring relationships between nearby data points. Multilayer perceptrons (MLPs) are usually considered a better option, but switching the flow model from U-Net to MLPs only worsened its predictions. The authors admit that they do not know why this is the case, and they suggest further research into cross-attention mechanisms, which may aggregate information better without spatial bias.
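The following toy sketch illustrates why a convolutional architecture is a surprising choice for data with no natural ordering. It is not taken from the preprint, and the kernel, gene count, and permutation are arbitrary assumptions. A convolution mixes each value with its neighbors, so shuffling the gene order changes its output, whereas a dense (MLP-style) layer can absorb any reordering by reordering its weight columns.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes = 8
expression = rng.normal(size=n_genes)          # one cell's expression (toy)
perm = np.array([3, 0, 6, 1, 7, 2, 5, 4])      # an arbitrary gene reordering

# Convolution: a small kernel slides over neighbouring positions.
kernel = np.array([0.25, 0.5, 0.25])
conv_original = np.convolve(expression, kernel, mode="same")
conv_permuted = np.convolve(expression[perm], kernel, mode="same")

# Map the permuted output back to the original gene order for comparison.
conv_restored = np.empty(n_genes)
conv_restored[perm] = conv_permuted

# Dense layer: permuting the inputs and the weight columns together
# leaves the output unchanged.
dense_weights = rng.normal(size=(n_genes, n_genes))
dense_original = dense_weights @ expression
dense_permuted = dense_weights[:, perm] @ expression[perm]

print("convolution unchanged by reordering:",
      np.allclose(conv_original, conv_restored))
print("dense layer unchanged by reordering:",
      np.allclose(dense_original, dense_permuted))
```

The convolution's result depends on which genes happen to sit next to each other, which is the spatial bias the authors refer to. That this bias did not hurt PRiMeFlow in practice is the unexplained observation.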

Best performance

In its optimal configuration, PRiMeFlow achieved state-of-the-art performance on three key benchmarks that are part of the PerturBench platform. Two of these benchmarks measure covariate transfer, that is, the ability of the model to predict the effects of perturbations under different conditions, including cell types that may not be included in the training data. On the third benchmark, which measures the prediction of combined perturbations, it outperformed many other models on all but one metric.

On a test set of human embryonic stem cells, PRiMeFlow performed very well, and its performance improved with further fine-tuning. Of all the models on the leaderboard, the optimally fine-tuned PRiMeFlow model came closest to the in vitro results.

The researchers offered a vision for the future, suggesting that their work could form the basis of virtual cells and, in theory, be used to model entire virtual organisms. However, making such a vision a reality requires overcoming various computational and algorithmic challenges.

Literature

[1] Watanabe, K., Panchy, N., Noguchi, S., Suzuki, H., & Hong, T. (2019). Combinatorial perturbation analysis reveals divergent regulations of mesenchymal genes during epithelial-to-mesenchymal transition. NPJ Systems Biology and Applications, 5(1), 21.

[2] Lotfollahi, M., Klimovskaia Susmelj, A., De Donno, C., Hetzel, L., Ji, Y., Ibarra, I. L., … & Theis, F. J. (2023). Predicting cellular responses to complex perturbations in high-throughput screens. Molecular Systems Biology, 19(6), MSB202211517.

[3] Klein, D., Fleck, J. S., Bobrovskiy, D., Zimmermann, L., Becker, S., Palma, A., … & Theis, F. J. (2025). CellFlow enables generative single-cell phenotype modeling with flow matching. bioRxiv, 2025-04.


