Open and sustainable AI: challenges, opportunities and the road ahead in the life sciences

Walsh, I. et al. DOME: recommendations for supervised machine learning validation in biology. Nat. Methods 18, 1122–1127 (2021).

Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016).

Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

Luo, M. et al. Artificial intelligence for life sciences: a comprehensive guide and future trends. Innov. Life 2, 100105 (2024).

Paysan-Lafosse, T. et al. The Pfam protein families database: embracing AI/ML. Nucleic Acids Res. 53, D523–D534 (2025).

Varadi, M. et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 50, D439–D444 (2022).

Kapoor, S. & Narayanan, A. Leakage and the reproducibility crisis in machine-learning-based science. Patterns 4, 100804 (2023).

Clark, T. et al. AI-readiness for biomedical data: Bridge2AI recommendations. Preprint at bioRxiv https://doi.org/10.1101/2024.10.23.619844 (2024).

Tedersoo, L. et al. Data sharing practices and data availability upon request differ across scientific disciplines. Sci. Data 8, 192 (2021).

Laurinavichyute, A., Yadav, H. & Vasishth, S. Share the code, not just the data: a case study of the reproducibility of articles published in the Journal of Memory and Language under the open data policy. J. Mem. Lang. 125, 104332 (2022).

Alper, P. et al. RDMkit: A research data management toolkit for life sciences. Patterns 6, 101345 (2025).

Pistoia Alliance. The FAIR toolkit for life science industry. https://fairtoolkit.pistoiaalliance.org (2020).

Ouyang, W. et al. BioImage Model Zoo: a community-driven resource for accessible deep learning in bioimage analysis. Preprint at bioRxiv https://doi.org/10.1101/2022.06.07.495102 (2022).

Avsec, Ž. et al. The Kipoi repository accelerates community exchange and reuse of predictive models for genomics. Nat. Biotechnol. 37, 592–600 (2019).

Akhtar, M. et al. Croissant: a metadata format for ML-ready datasets. In Proceedings of the Eighth Workshop on Data Management for End-to-End Machine Learning (eds Hulsebos, M., Interlandi, M. & Shankar, S.) 1–6 (Association for Computing Machinery, 2024).

Research Data Alliance. RDA FAIR for Machine Learning (FAIR4ML) Interest Group. https://www.rd-alliance.org/groups/fair-machine-learning-fair4ml-ig/activity (2022).

Beam, A. L., Manrai, A. K. & Ghassemi, M. Challenges to the reproducibility of machine learning models in health care. JAMA 323, 305–306 (2020).

Unsal, S. et al. Learning functional properties of proteins with language models. Nat. Mach. Intell. 4, 227–245 (2022).

Sapkota, R., Roumeliotis, K. I. & Karkee, M. AI agents vs. agentic AI: A conceptual taxonomy, applications and challenges. Inf. Fusion 126, 103599 (2026).

Schwartz, R., Dodge, J., Smith, N. A. & Etzioni, O. Green AI. Commun. ACM 63, 54–63 (2020).

White, M. et al. The Model Openness Framework: promoting completeness and openness for reproducibility, transparency, and usability in artificial intelligence. Preprint at https://doi.org/10.48550/arXiv.2403.13784 (2024).

Lekadir, K. et al. FUTURE-AI: international consensus guideline for trustworthy and deployable artificial intelligence in healthcare. BMJ 388, e081554 (2025).

Kapoor, S. et al. REFORMS: consensus-based recommendations for machine-learning-based science. Sci. Adv. 10, eadk3452 (2024).

Machine Learning Commons. MLCommons: better AI for everyone. https://mlcommons.org (2025).

FAIR Advanced Research and Reproducibility (FARR) Research Coordination Network. FARR RCN. https://www.farr-rcn.org (2025).

Rai, A. Explainable AI: from black box to glass box. J. Acad. Mark. Sci. 48, 137–141 (2020).

Afroogh, S., Akbari, A., Malone, E., Kargar, M. & Alambeigi, H. Trust in AI: progress, challenges, and future directions. Humanit. Soc. Sci. Commun. 11, 1568 (2024).

Leslie, D. Understanding Artificial Intelligence Ethics and Safety: a Guide for the Responsible Design and Implementation of AI Systems in the Public Sector (The Alan Turing Institute, 2019).

Dignum, V. Responsible artificial intelligence: from principles to practice. Preprint at https://doi.org/10.48550/arXiv.2205.10785 (2022).

Ahdritz, G. et al. OpenFold: retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization. Nat. Methods 21, 1514–1524 (2024).

Collins, G. S. et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 384, e078378 (2024).

Schmied, C. et al. Community-developed checklists for publishing images and image analyses. Nat. Methods 21, 170–181 (2024).

Liu, X. et al. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat. Med. 26, 1364–1374 (2020).

Cruz Rivera, S. et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat. Med. 26, 1351–1363 (2020).

Kaggle. Kaggle: your machine learning and data science community. https://www.kaggle.com (2025).

Wolf, T. et al. HuggingFace’s Transformers: state-of-the-art natural language processing. Preprint at https://doi.org/10.48550/arXiv.1910.03771 (2019).

Turon, G., Legese, A., Arora, D. & Duran-Frigola, M. Ersilia Model Hub: a repository of AI/ML models for infectious and neglected tropical diseases. Zenodo https://doi.org/10.5281/ZENODO.7274645 (2025).

European Organization For Nuclear Research (CERN) & OpenAIRE. Zenodo https://doi.org/10.25495/7GXK-RD71 (2013).

Leo, S. et al. Recording provenance of workflow runs with RO-Crate. PLoS ONE 19, e0309210 (2024).

Huerta, E. A. et al. FAIR for AI: an interdisciplinary and international community building perspective. Sci. Data 10, 487 (2023).

Castro, L. J. et al. FAIR4ML-schema. Zenodo https://doi.org/10.5281/ZENODO.14002310 (2024).

Pistoia Alliance. Pistoia Alliance organisation website. https://www.pistoiaalliance.org (2025).

Open Data Institute. A framework for AI-ready data. https://theodi.hacdn.io/media/documents/A_framework_for_AI-ready_data.pdf (2025).

Scientific Computing World. Pistoia Alliance launches DataFAIRy to drive AI adoption. https://www.scientific-computing.com/news/pistoia-alliance-launches-datafairy-drive-ai-adoption (2024).

Desai, A., Abdelhamid, M. & Padalkar, N. R. What is reproducibility in artificial intelligence and machine learning research? AI Mag. 46, e70004 (2025).

Carter, R. E., Attia, Z. I., Lopez-Jimenez, F. & Friedman, P. A. Pragmatic considerations for fostering reproducible research in artificial intelligence. NPJ Digit. Med. 2, 42 (2019).

Tiwari, D. D. et al. BioModelsML: building a FAIR and reproducible collection of machine learning models in life sciences and medicine for easy reuse. Preprint at bioRxiv https://doi.org/10.1101/2023.05.22.540599 (2023).

Merkel, D. Docker: lightweight Linux containers for consistent development and deployment. Linux J. 2014, 2 (2014).

Anaconda. Conda https://anaconda.org/anaconda/conda (2025).

Di Tommaso, P. et al. Nextflow enables reproducible computational workflows. Nat. Biotechnol. 35, 316–319 (2017).

Köster, J. & Rahmann, S. Snakemake: a scalable bioinformatics workflow engine. Bioinformatics 28, 2520–2522 (2012).

The Galaxy Community. The Galaxy platform for accessible, reproducible, and collaborative data analyses: 2024 update. Nucleic Acids Res. 52, W83–W94 (2024).

Heil, B. J. et al. Reproducibility standards for machine learning in the life sciences. Nat. Methods 18, 1132–1135 (2021).

Bisong, E. Google Colaboratory. In Building Machine Learning and Deep Learning Models on Google Cloud Platform Ch. 7, 59–64 (Apress, 2019).

Anthony, L. F. W., Kanding, B. & Selvan, R. Carbontracker: tracking and predicting the carbon footprint of training deep learning models. Preprint at https://doi.org/10.48550/arXiv.2007.03051 (2020).

Ritchie, H. et al. Hardware and energy cost to train notable AI systems. Our World in Data https://ourworldindata.org/grapher/hardware-and-energy-cost-to-train-notable-ai-systems (2023).

Gailhofer, P. et al. The Role of Artificial Intelligence in the European Green Deal (European Parliament, 2023).

Bolón-Canedo, V. et al. A review of green artificial intelligence: towards a more sustainable future. Neurocomputing 599, 128096 (2024).

EMBL. Sustainability: reports and resources. https://www.embl.org/about/info/sustainability/reports-resources (2025).

Yamada, T. et al. Frugal machine learning: making AI more efficient, accessible, and sustainable. Preprint at https://doi.org/10.36227/techrxiv.173385981.11102720/v1 (2024).

Tornede, T. et al. Towards green automated machine learning: status quo and future directions. J. Artif. Intell. Res. 77, 427–457 (2023).

Johnson, S. G., Simon, G. & Aliferis, C. Regulatory aspects and ethical legal societal implications (ELSI). In Artificial Intelligence and Machine Learning in Health Care and Medical Sciences (eds Simon, G. J. & Aliferis, C.) Ch. 16, 659–692 (Springer, 2024).

Jefferson, E. et al. GRAIMatter: guidelines and resources for AI model access from TrusTEd research environments (GRAIMatter). Int. J. Popul. Data Sci. 7, 2005 (2022).

European Commission. AI for Health: evaluation of applications & datasets (AHEAD). CORDIS https://cordis.europa.eu/project/id/101183031 (2024).

European Commission. HORIZON Europe: ELIXIR-STEERS project. CORDIS https://cordis.europa.eu/project/id/101131096 (2024).

SustAInML. Sustainable AI and Machine Learning. https://sustainml.eu (2021).

Software Sustainability Institute. Green DiSC: a digital sustainability certification. https://www.software.ac.uk/GreenDiSC (2025).

Geoscience and Remote Sensing Society (GRSS). GeoCroissant: a metadata framework for geospatial ML-ready datasets. https://www.grss-ieee.org/events/geocroissant-a-metadata-framework-for-geospatial-ml-ready-datasets (2024).

Mitchell, M. et al. Model cards for model reporting. In Proceedings of the 2019 Conference on Fairness, Accountability, and Transparency (eds Friedler, S. A. & Wilson, C.) 220–229 (Association for Computing Machinery, 2019).

Pushkarna, M., Zaldivar, A. & Kjartansson, O. Data cards: purposeful and transparent dataset documentation for responsible AI. Preprint at https://doi.org/10.48550/ARXIV.2204.01075 (2022).

Dasoulas, I., Yang, D. & Dimou, A. MLSea: a semantic layer for discoverable machine learning. In The Semantic Web (eds Meroño Peñuela, A. et al.) Ch. 11, 178–198 (Springer, 2024).

SciLifeLab Data Centre. SciLifeLab: funder requirements and FAIR ML models. https://serve.scilifelab.se/docs/model-serving/fair (2025).

Van Geest, G. et al. Using Glittr.org to find, compare and re-use online materials for training and education. PLoS ONE 19, e0308729 (2024).

Data Carpentry. Data Carpentry lessons. https://datacarpentry.org/lessons (2025).

The Turing Way Community. The Turing way: a handbook for reproducible, ethical and collaborative research. Zenodo https://doi.org/10.5281/ZENODO.15213042 (2025).

ONNX. ONNX: Open Neural Network Exchange. https://onnx.ai/ (2025).

Attafi, O. A. et al. DOME registry: implementing community-wide recommendations for reporting supervised machine learning in biology. GigaScience 13, giae094 (2024).

Kurtzer, G. M., Sochat, V. & Bauer, M. W. Singularity: scientific containers for mobility of compute. PLoS ONE 12, e0177459 (2017).

Docker. Docker Hub container image library. https://hub.docker.com (2025).

Yuen, D. et al. The Dockstore: enhancing a community platform for sharing reproducible and accessible computational protocols. Nucleic Acids Res. 49, W624–W632 (2021).

Clyburne-Sherin, A., Fei, X. & Green, S. A. Computational reproducibility via containers in psychology. Meta Psychol. 3, 892 (2019).

Kryshtafovych, A. et al. Critical assessment of methods of protein structure prediction (CASP): round XV. Proteins 91, 1539–1549 (2023).

Xiong, Z. et al. Crowdsourced identification of multi-target kinase inhibitors for RET- and TAU- based disease: the Multi-Targeting Drug DREAM Challenge. PLoS Comput. Biol. 17, e1009302 (2021).

Capella-Gutierrez, S. et al. Lessons learned: recommendations for establishing critical periodic scientific benchmarking. Preprint at bioRxiv https://doi.org/10.1101/181677 (2017).

Ash, J. T. & Adams, R. P. On warm-starting neural network training. In Advances in Neural Information Processing Systems 33 (eds Larochelle, H. et al.) 3884–3894 (Curran Associates, 2020).

Tmamna, J. et al. Pruning deep neural networks for green energy-efficient models: a survey. Cogn. Comput. 16, 2931–2952 (2024).

Krishnan, S. & Faust, A. Quantization for fast and environmentally sustainable reinforcement learning. Google Research Blog https://research.google/blog/quantization-for-fast-and-environmentally-sustainable-reinforcement-learning (2021).

Yuan, Y. et al. The impact of knowledge distillation on the energy consumption and runtime efficiency of NLP models. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering – Software Engineering for AI (eds Cleland-Huang, J., Bosch, J., Muccini, H. & Lewis, G. A.) 129–133 (Association for Computing Machinery, 2024).

Tabbakh, A. et al. Towards sustainable AI: a comprehensive framework for Green AI. Discov. Sustain. 5, 408 (2024).

Guo, D. et al. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning. Nature 645, 633–638 (2025).

Green Software Foundation. Green software patterns. https://patterns.greensoftware.foundation (2025).

Green Software Foundation. Green Software Foundation. https://greensoftware.foundation (2025).

TOP500.org. Green500 List: November 2023. https://top500.org/lists/green500/2023/11 (2023).

Performance Optimisation and Productivity Centre of Excellence in HPC. https://pop-coe.eu (2025).

Schmidt, V. et al. Machine learning CO₂ impact calculator. https://mlco2.github.io/impact (2025).

GitHub. Official Repository of MICCAI FLARE Challenges. https://github.com/JunMa11/FLARE (2025).

Henderson, P. et al. Towards the systematic reporting of the energy and carbon footprints of machine learning. J. Mach. Learn. Res. 21, 10039–10081 (2020).

Ravi, N. et al. FAIR principles for AI models with a practical application for accelerated high energy diffraction microscopy. Sci. Data 9, 657 (2022).

Farrell, G. OSAI ecosystem components data. Zenodo https://doi.org/10.5281/zenodo.15391273 (2025).

RSQKit Community. Research software quality kit (RSQKit). Zenodo https://doi.org/10.5281/zenodo.14923572 (2025).

Gavriilidis, G. I. et al. APNet, an explainable sparse deep learning model to discover differentially active drivers of severe COVID-19. Bioinformatics 41, btaf063 (2025).

D’Anna, F. et al. A research data management (RDM) community for ELIXIR. F1000Res. 13, 230 (2024).

BY-COVID. Infectious Diseases Toolkit (IDTk). https://www.infectious-diseases-toolkit.org (2025).

Mungall, C. Open knowledge bases in the age of generative AI. F1000Res. https://doi.org/10.7490/F1000RESEARCH.1120248.1 (2025).

Yiyao, L. et al. OmicsNavigator: an LLM-driven multi-agent system for autonomous zero-shot biological analysis in spatial omics. Preprint at bioRxiv https://doi.org/10.1101/2025.07.21.665821 (2025).

Huang, K. et al. Biomni: a general-purpose biomedical AI agent. Preprint at bioRxiv https://doi.org/10.1101/2025.05.30.656746 (2025).

Wei, J. et al. From AI for science to agentic science: a survey on autonomous scientific discovery. Preprint at https://doi.org/10.48550/arXiv.2508.14111 (2025).

Kim, J. et al. The cost of dynamic reasoning: demystifying AI agents and test-time scaling from an AI infrastructure perspective. Preprint at https://doi.org/10.48550/arXiv.2506.04301 (2025).

European Commission. The EU AI Act. https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai (2024).

National Science Foundation. National Artificial Intelligence Research Resource (NAIRR) pilot. https://www.nsf.gov/focus-areas/artificial-intelligence/nairr (2024).

The White House. America’s AI action plan. https://www.whitehouse.gov/wp-content/uploads/2025/07/Americas-AI-Action-Plan.pdf (2025).

Declaration on Research Assessment (DORA). https://sfdora.org/about-dora (2025).

CoARA. Coalition for Advancing Research Assessment. https://coara.org (2025).

Wang, Y. et al. SimpleFold: folding proteins is simpler than you think. Preprint at https://doi.org/10.48550/arXiv.2509.18480 (2025).

Abramson, J. et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).

AlphaFold3: why did Nature publish it without its code? Nature 629, 728 (2024).

Callaway, E. AI protein-prediction tool AlphaFold3 is now more open. Nature 635, 531–532 (2024).

Global Alliance for Genomics & Health (GA4GH). https://www.ga4gh.org (2025).

Pascucci, E. et al. Progressing towards personalised medicine: the Genomic Data Infrastructure (GDI) project. Eur. J. Public Health 34, ckae144.1956 (2024).

Heredia, I. et al. AI4EOSC: a federated cloud platform for artificial intelligence in scientific research. Preprint at https://arxiv.org/abs/2512.16455 (2025).