Decoding complexity through machine learning is redefining scientific discovery

Tucci, P. History of scientific instrumentation and history of science. In Pisano, R. (ed.) A History of Physics: Phenomena, Ideas and Mechanisms, vol. 42 of History of Mechanism and Machine Science (Springer, 2025).

Szalay, A. & Gray, J. Science in an exponential world. Nature 440, 413–414 (2006).

Eisenstein, M. Big data: the power of petabytes. Nature 527, S2–S4 (2015).

Evans, L. The large hadron collider. Annu. Rev. Nucl. Part. Sci. 61, 435–466 (2011).

Dewdney, P. E., Hall, P. J., Schilizzi, R. T. & Lazio, T. J. L. The square kilometre array. Proc. IEEE 97, 1482–1496 (2009).

Huang, L. & Peissl, W. Artificial intelligence – a new knowledge and decision-making paradigm? In Hennen, L. et al. (eds.) Technology Assessment in a Globalized World (Springer, 2023).

Resnik, D. & Hosseini, M. The ethics of using artificial intelligence in scientific research: new guidance needed for a new tool. (AI Ethics, 2024).

UNESCO. Recommendation on the ethics of artificial intelligence. SHS/BIO/PI/2021/1 1–43 (UNESCO, 2022).

Shapson-Coe, A. et al. A petavoxel fragment of human cerebral cortex reconstructed at nanoscale resolution. Science 384, eadk4858 (2024).

Badrulhisham, F., Pogatzki-Zahn, E., Segelcke, D., Spisak, T. & Vollert, J. Machine learning and artificial intelligence in neuroscience: a primer for researchers. Brain Behav. Immun. 115, 470–479 (2024).

Hutson, M. How AI is being used to accelerate clinical trials. Nature 627, https://www.nature.com/articles/d41586-024-00753-x (2024).

Malik, A., Moster, B. P. & Obermeier, C. Exoplanet detection using machine learning. Mon. Not. R. Astron. Soc. 513, 5505–5516 (2022).

Leleu, A. et al. Alleviating the transit timing variation bias in transit surveys – I. RIVERS: method and detection of a pair of resonant super-Earths around Kepler-1705. Astron. Astrophys. 655, A66 (2021).

Radovic, A. et al. Machine learning at the energy and intensity frontiers of particle physics. Nature 560, 41–48 (2018).

De Moura, L., Kong, S., Avigad, J., Van Doorn, F. & von Raumer, J. The Lean theorem prover (system description). In International Conference on Automated Deduction, 378–388 (Springer, 2015).

Lample, G. et al. Hypertree proof search for neural theorem proving. Adv. Neural Inf. Process. Syst. 35, 26337–26349 (2022).

Hubert, T. et al. Olympiad-level formal mathematical reasoning with reinforcement learning. Nature 651, 607–613 (2025).

Novikov, A. et al. AlphaEvolve: a coding agent for scientific and algorithmic discovery. arXiv preprint arXiv:2506.13131 (2025).

Cervera-Lierta, A., Krenn, M. & Aspuru-Guzik, A. Design of quantum optical experiments with logic artificial intelligence. Quantum 6, 836 (2022).

Acharya, D. B., Kuppan, K. & Divya, B. Agentic AI: autonomous intelligence for complex goals–a comprehensive survey. IEEE Access 13, 18912–18936 (2025).

Sapkota, R., Roumeliotis, K. I. & Karkee, M. AI agents vs. agentic AI: a conceptual taxonomy, applications and challenges. arXiv preprint arXiv:2505.10468 (2025).

Nägele, M. & Marquardt, F. Agentic exploration of physics models. arXiv preprint arXiv:2509.24978 (2025).

Davies, A., Veličković, P., Buesing, L. et al. Advancing mathematics by guiding human intuition with AI. Nature 600, 70–74 (2021).

Mohri, M., Rostamizadeh, A. & Talwalkar, A. Foundations of Machine Learning (MIT Press, 2018).

LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

Sevilla, J. et al. Compute trends across three eras of machine learning. In 2022 International Joint Conference on Neural Networks (IJCNN), 1–8 (IEEE, 2022).

Wong, F. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626, 177–185 (2024).

Stokes, J. M., Yang, K., Swanson, K. et al. A deep learning approach to antibiotic discovery. Cell 180, 688–702 (2020).

Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1, 206–215 (2019).

Vinuesa, R. & Sirmacek, B. Interpretable deep-learning models to help achieve the sustainable development goals. Nat. Mach. Intell. 3, 926 (2021).

Vinuesa, R., Brunton, S. L. & Mengaldo, G. Explainable AI: Learning from the learners. Preprint arXiv:2601.05525 (2026).

Ferreira, C. Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence, vol. 21 (Springer, 2006).

Schmidt, M. & Lipson, H. Distilling free-form natural laws from experimental data. Science 324, 81–85 (2009).

Brunton, S. L., Proctor, J. L. & Kutz, J. N. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 113, 3932–3937 (2016).

Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).

Leslie, D. Does the sun rise for ChatGPT? Scientific discovery in the age of generative AI. (AI and Ethics, 2023).

Halevy, A., Norvig, P. & Pereira, F. The unreasonable effectiveness of data. IEEE Intell. Syst. 24, 8–12 (2009).

Sejnowski, T. J. The unreasonable effectiveness of deep learning in artificial intelligence. Proc. Natl. Acad. Sci. 117, 30033–30038 (2020).

Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).

Zenil, H. et al. The future of fundamental science led by generative closed-loop artificial intelligence. Front. Artif. Intell. 10 (2026).

Van Dis, E. A., Bollen, J., Zuidema, W., Van Rooij, R. & Bockting, C. L. ChatGPT: five priorities for research. Nature 614, 224–226 (2023).

OpenAI. ChatGPT: optimizing language models for dialogue. https://openai.com/blog/chatgpt/ (2022).

Granjeiro, J. M. et al. The future of scientific writing: AI tools, benefits, and ethical implications. Braz. Dent. J. 36, e25–6471 (2025).

Daniotti, S., Wachs, J., Feng, X. & Neffke, F. Who is using AI to code? Global diffusion and impact of generative AI. Science eadz9311 (2026).

Bellemare-Pepin, A. et al. Divergent creativity in humans and large language models. Sci. Rep. 16, 1279 (2026).

M. Bran, A. et al. Augmenting large language models with chemistry tools. Nat. Mach. Intell. 6, 525–535 (2024).

Gottweis, J. et al. Towards an AI co-scientist. arXiv preprint arXiv:2502.18864 (2025).

Su, H. et al. Many heads are better than one: improved scientific idea generation by a LLM-based multi-agent system. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 28201–28240 (ACL, 2025).

Boiko, D., MacKnight, R., Kline, B. & Gomes, G. Autonomous chemical research with large language models. Nature 624, 570–578 (2023).

Wang, Q., Downey, D., Ji, H. & Hope, T. SciMON: scientific inspiration machines optimized for novelty. arXiv preprint arXiv:2305.14259 (2023).

Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

Hsu, C. et al. Learning inverse folding from millions of predicted structures. ICML https://www.biorxiv.org/content/early/2022/04/10/2022.04.10.487779 (2022).

Chen, S., Long, G., Jiang, J., Liu, D. & Zhang, C. Foundation models for weather and climate data understanding: a comprehensive survey. arXiv preprint arXiv:2312.03014 (2023).

Wei, J. et al. Emergent abilities of large language models. arXiv preprint arXiv:2206.07682 (2022).

Schaeffer, R., Miranda, B. & Koyejo, S. Are emergent abilities of large language models a mirage? Advances in Neural Information Processing Systems 36 (NIPS, 2024).

Anderson, P. W. More is different: Broken symmetry and the nature of the hierarchical structure of science. Science 177, 393–396 (1972).

Birhane, A., Kasirzadeh, A., Leslie, D. & Wachter, S. Science in the age of large language models. Nat. Rev. Phys. 5, 277–280 (2023).

Fajardo-Fontiveros, O. et al. Fundamental limits to learning closed-form mathematical models from data. Nat. Commun. 14, 1043 (2023).

Bechtel, W. & Richardson, R. Discovering Complexity (Princeton University Press, 1993).

Ben-Menahem, Y. Causation in Science (Princeton University Press, 2018).

Krenn, M., Pollice, R., Guo, S. Y. et al. On scientific understanding with artificial intelligence. Nat. Rev. Phys. 4, 761–769 (2022).

National Academy of Sciences. How AI is shaping scientific discovery. https://www.nationalacademies.org/news/2023/11/how-ai-is-shaping-scientific-discovery (2023).

Reddy, C. K. & Shojaee, P. Towards scientific discovery with generative AI: progress, opportunities, and challenges (ACM, 2025).

Nature Editorial. AI will transform science – now researchers must tame it. Nature 621, 658 (2023).

Cory-Wright, R., Cornelio, C., Dash, S. et al. Evolving scientific discovery by unifying data and background knowledge with AI Hilbert. Nat. Commun. 15, 5922 (2024).

Cornelio, C., Dash, S., Austel, V. et al. Combining data and theory for derivable scientific discovery with AI-Descartes. Nat. Commun. 14, 1777 (2023).

Bianchini, S., Müller, M. & Pelletier, P. Artificial intelligence in science: an emerging general method of invention. Res. Policy 51, 104604 (2022).

Martínez-Sánchez, Á., Arranz, G. & Lozano-Durán, A. Decomposing causality into its synergistic, unique, and redundant components. Nat. Commun. 15, 9296 (2024).

Cranmer, M. et al. Discovering symbolic models from deep learning with inductive biases. 34th Conference on Neural Information Processing Systems (NeurIPS, 2020).

Cremades, A. et al. Identifying regions of importance in wall-bounded turbulence through explainable deep learning. Nat. Commun. 15, 3864 (2024).

Chapman, S. & Cowling, T. G. The Mathematical Theory of Non-Uniform Gases: An Account of the Kinetic Theory of Viscosity, Thermal Conduction and Diffusion in Gases (Cambridge University Press, 1990).

Fefferman, C. L. Existence and smoothness of the Navier-Stokes equation. Millennium Prize Probl. 57, 67 (2000).

Lozano-Durán, A. & Arranz, G. Information-theoretic formulation of dynamical systems: causality, modeling, and control. Phys. Rev. Res. 4, 023195 (2022).

Angriman, S. et al. Active grid turbulence anomalies through the lens of physics informed neural networks. Results Eng. 24, 103265 (2024).

Duraisamy, K., Iaccarino, G. & Xiao, H. Turbulence modeling in the age of data. Annu. Rev. Fluid Mech. 51, 357–377 (2019).

Brunton, S. L., Noack, B. R. & Koumoutsakos, P. Machine learning for fluid mechanics. Annu. Rev. Fluid Mech. 52, 477–508 (2020).

Tamayo, D. et al. Predicting the long-term stability of compact multiplanet systems. Proc. Natl. Acad. Sci. USA 117, 18194–18205 (2020).

Borah, S., Sarma, B., Kewming, M., Milburn, G. J. & Twamley, J. Measurement-based feedback quantum control with deep reinforcement learning for a double-well nonlinear potential. Phys. Rev. Lett. 127, 190403 (2021).

Yi, K., Moon, Y.-J. & Jeong, H.-J. Application of deep reinforcement learning to major solar flare forecasting. Astrophys. J. Suppl. Ser. 265, 34 (2023).

Guastoni, L., Rabault, J., Schlatter, P., Azizpour, H. & Vinuesa, R. Deep reinforcement learning for turbulent drag reduction in channel flows. Eur. Phys. J. E 46, 27 (2023).

Seo, J. et al. Avoiding fusion plasma tearing instability with deep reinforcement learning. Nature 626, 746–751 (2024).

Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).

Degrave, J. et al. Magnetic control of tokamak plasmas through deep reinforcement learning. Nature 602, 414–419 (2022).

Beeler, C. et al. Optimizing thermodynamic trajectories using evolutionary and gradient-based reinforcement learning. Phys. Rev. E 104, 064128 (2021).

Solera-Rico, A. et al. β-Variational autoencoders and transformers for reduced-order modelling of fluid flows. Nat. Commun. 15, 1361 (2024).

Park, S. et al. Optimization of physical quantities in the autoencoder latent space. Sci. Rep. 12, 9003 (2022).

Goodfellow, I. et al. Generative adversarial networks. Preprint arXiv:1406.2661 (2014).

Albertsson, K. et al. Machine learning in high energy physics community white paper. In Journal of Physics: Conference Series, vol. 1085, 022008 (IOP Publishing, 2018).

Alexeev, Y. et al. Artificial intelligence for quantum computing. Nat. Commun. 16, 10829 (2025).

Wong, C. How AI is improving climate forecasts. Nature 628, 710–712 (2024).

Fawzi, A. et al. Discovering faster matrix multiplication algorithms with reinforcement learning. Nature 610, 47–53 (2022).

Romera-Paredes, B. et al. Mathematical discoveries from program search with large language models. Nature 625, 468–475 (2024).

Souza, L. F., Rocha Filho, T. M. & Moret, M. A. Relating SARS-CoV-2 variants using cellular automata imaging. Sci. Rep. 12, 10297 (2022).

Alert, R., Casademunt, J. & Joanny, J.-F. Active turbulence. Annu. Rev. Condens. Matter Phys. 13, 143–170 (2022).

Liu, Z., Chen, Y., Du, Y. & Tegmark, M. Physics-augmented learning: A new paradigm beyond physics-informed learning. arXiv preprint arXiv:2109.13901 (2021).

Moya, B., Badías, A., González, D., Chinesta, F. & Cueto, E. A thermodynamics-informed active learning approach to perception and reasoning about fluids. Comput. Mech. 72, 577–591 (2023).

Kapteyn, M. G., Pretorius, J. V. R. & Willcox, K. E. A probabilistic graphical model foundation for enabling predictive digital twins at scale. Nat. Comput. Sci. 1, 337–347 (2021).

Kneer, S., Sayadi, T., Sipp, D., Schmid, P. & Rigas, G. Symmetry-aware autoencoders: s-PCA and s-NLPCA. Preprint arXiv:2111.02893v3 (2022).

Otto, S. E., Zolman, N., Kutz, J. N. & Brunton, S. L. A unified framework to enforce, discover, and promote symmetry in machine learning. J. Mach. Learn. Res. 26, 1–83 (2025).

Flaschel, M., Kumar, S. & De Lorenzis, L. Automated discovery of generalized standard material models with EUCLID. Comput. Methods Appl. Mech. Eng. 405, 115867 (2023).

Schmelzer, M., Dwight, R. P. & Cinnella, P. Discovery of algebraic Reynolds-stress models using sparse symbolic regression. Flow Turbul. Combust. 104, 579–603 (2020).

Wang, M., Chen, C. & Liu, W. Establish algebraic data-driven constitutive models for elastic solids with a tensorial sparse symbolic regression method and a hybrid feature selection technique. J. Mech. Phys. Solids 159, 104742 (2022).

Mahmoudabadbozchelou, M., Kamani, K. M., Rogers, S. A. & Jamali, S. Digital rheometer twins: learning the hidden rheology of complex fluids through rheology-informed graph neural networks. Proc. Natl. Acad. Sci. USA 119, e2202234119 (2022).

Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

Madani, A. et al. Large language models generate functional protein sequences across diverse families. Nat. Biotechnol. 41, 1099–1106 (2023).

Dauparas, J. et al. Robust deep learning-based protein sequence design using ProteinMPNN. Science 378, 49–56 (2022).

Watson, J. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023).

Ingraham, J. et al. Illuminating protein space with a programmable generative model. Nature 623, 1070–1078 (2023).

Irwin, R., Dimitriadis, S., He, J. & Bjerrum, E. J. Chemformer: a pre-trained transformer for computational chemistry. Mach. Learn.: Sci. Technol. 3, 015022 (2022).

Ma, H. et al. Machine learning for estimation and control of quantum systems. Natl. Sci. Rev. 12, nwaf269 (2025).

Bukov, M. et al. Reinforcement learning in different phases of quantum control. Phys. Rev. X 8, 031086 (2018).

Zhang, X.-M., Wei, Z., Asad, R., Yang, X.-C. & Wang, X. When does reinforcement learning stand out in quantum control? A comparative study on state preparation. npj Quantum Inf. 5, 85 (2019).

Cai, Z. et al. Quantum error mitigation. Rev. Mod. Phys. 95, 045005 (2023).

Canonici, E., Martina, S., Mengoni, R., Ottaviani, D. & Caruso, F. Machine learning based noise characterization and correction on neutral atoms NISQ devices. Adv. Quantum Technol. 7, 2300192 (2024).

Biamonte, J. et al. Quantum machine learning. Nature 549, 195–202 (2017).

Ciliberto, C. et al. Quantum machine learning: a classical perspective. Proc. R. Soc. A 474, 20170551 (2018).

Frezat, H., Sommer, J., Fablet, R., Balarac, G. & Lguensat, R. A posteriori learning for quasi-geostrophic turbulence parametrization. J. Adv. Model. Earth Syst. 14, e2022MS003124 (2022).

Molina, M. J. et al. A review of recent and emerging machine learning applications for climate variability and weather phenomena. Artif. Intell. Earth Syst. 2, 220086 (2023).

Steinmetz, N. A., Zatka-Haas, P., Carandini, M. & Harris, K. D. Distributed coding of choice, action and engagement across the mouse brain. Nature 576, 266–273 (2019).

Yao, S. et al. A whole-brain monosynaptic input connectome to neuron classes in mouse visual cortex. Nat. Neurosci. 26, 350–364 (2023).

MICrONS Consortium et al. Functional connectomics spanning multiple areas of mouse visual cortex. bioRxiv 2021-07 (2021).

Nelson, M. & Rinzel, J. The Hodgkin–Huxley model. The Book of Genesis (Springer, 1995).

Chen, Y., Luo, Y., Liu, Q., Xu, H. & Zhang, D. Symbolic genetic algorithm for discovering open-form partial differential equations (SGA-PDE). Phys. Rev. Res. 4, 023174 (2022).

Du, M., Chen, Y. & Zhang, D. DISCOVER: deep identification of symbolic open-form PDEs via enhanced reinforcement-learning. Phys. Rev. Res. 6, 013182 (2024).

Chen, R. T., Rubanova, Y., Bettencourt, J. & Duvenaud, D. K. Neural ordinary differential equations. Adv. Neural Inf. Process. Syst. 31 (2018).

Becker, S., Klein, M., Neitz, A., Parascandolo, G. & Kilbertus, N. Predicting ordinary differential equations with transformers. In International Conference on Machine Learning, 1978–2002 (PMLR, 2023).

Sahoo, S., Lampert, C. & Martius, G. Learning equations for extrapolation and control. In International Conference on Machine Learning, 4442–4450 (PMLR, 2018).

Qiu, S. et al. Development and validation of an interpretable deep learning framework for Alzheimer’s disease classification. Brain 143, 1920–1933 (2020).

Ji, Y., Lotfollahi, M., Wolf, F. A. & Theis, F. J. Machine learning for perturbational single-cell omics. Cell Syst. 12, 522–537 (2021).

MacLeod, B. P. et al. A self-driving laboratory advances the Pareto front for material properties. Nat. Commun. 13, 995 (2022).

Du, J., Futoma, J. & Doshi-Velez, F. Model-based reinforcement learning for semi-Markov decision processes with neural ODEs. Adv. Neural Inf. Process. Syst. 33, 19805–19816 (2020).

Lu, C. et al. Towards end-to-end automation of AI research. Nature 651, 14–919 (2026).

Bengio, Y., Courville, A. & Vincent, P. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828 (2013).

Schölkopf, B. et al. Toward causal representation learning. Proc. IEEE 109, 612–634 (2021).

Camps-Valls, G. et al. Discovering causal relations and equations from data. Phys. Rep. 1044, 1–68 (2023).

Runge, J. et al. Inferring causation from time series in Earth system sciences. Nat. Commun. 10, 2553 (2019).

Lobentanzer, S., Rodriguez-Mier, P., Bauer, S. & Saez-Rodriguez, J. Molecular causality in the advent of foundation models. Preprint arXiv:2401.09558 (2024).

Pfister, N., Bauer, S. & Peters, J. Learning stable and predictive structures in kinetic systems. Proc. Natl. Acad. Sci. USA 116, 25405–25411 (2019).

Champion, K., Lusch, B., Kutz, J. N. & Brunton, S. L. Data-driven discovery of coordinates and governing equations. Proc. Natl. Acad. Sci. USA 116, 22445–22451 (2019).

Lu, C. et al. DPM-Solver: a fast ODE solver for diffusion probabilistic model sampling in around 10 steps. Adv. Neural Inf. Process. Syst. 35, 5775–5787 (2022).

Rombach, R., Blattmann, A., Lorenz, D., Esser, P. & Ommer, B. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10684–10695 (IEEE, 2022).

Zeni, C. et al. A generative model for inorganic materials design. Nature 639, 624–632 (2025).

Nichani, E., Damian, A. & Lee, J. D. How transformers learn causal structure with gradient descent. Preprint arXiv:2402.14735 (2024).

Vetter, J., Macke, J. H. & Gao, R. Generating realistic neurophysiological time series with denoising diffusion probabilistic models. bioRxiv 2023-08 (2023).

Kirillov, A. et al. Segment anything. Preprint arXiv:2304.02643 (2023).

Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).

Bommasani, R. et al. On the opportunities and risks of foundation models. Preprint arXiv:2108.07258 (2022).

Carreon, A., Sharma, V. & Raman, V. Automated design optimization via strategic search with large language models. Preprint arXiv:2511.22651 (2025).

Zvyagin, M. et al. GenSLMs: genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. Int. J. High. Perform. Comput. Appl. 37, 683–705 (2023).

Bzdok, D. et al. Data science opportunities of large language models for neuroscience and biomedicine. Neuron 112, 698–717 (2024).

Hollmann, N. et al. Accurate predictions on small data with a tabular foundation model. Nature 637, 319–326 (2025).

Singh, S. & Hooda, S. A study of challenges and limitations to applying machine learning to highly unstructured data. In 2023 7th International Conference On Computing, Communication, Control And Automation (ICCUBEA), 1–6 (ICCUBEA, 2023).

Jones, C., Castro, D. C., De Sousa Ribeiro, F. et al. A causal perspective on dataset bias in machine learning for medical imaging. Nat. Mach. Intell. 6, 138–146 (2024).

Vaidya, A., Chen, R. J., Williamson, D. F. K. et al. Demographic bias in misdiagnosis by computational pathology models. Nat. Med. 30, 1174–1190 (2024).

van Giffen, B., Herhausen, D. & Fahse, T. Overcoming the pitfalls and perils of algorithms: a classification of machine learning biases and mitigation methods. J. Bus. Res. 144, 93–106 (2022).

Agrawal, A., McHale, J. & Oettl, A. Artificial intelligence and scientific discovery: a model of prioritized search. Res. Policy 53, 104989 (2024).

Savage, N. Breaking into the black box of artificial intelligence. Nature 66, 101797 (2022).

Quinn, T. P., Jacobs, S., Senadeera, M., Le, V. & Coghlan, S. The three ghosts of medical AI: can the black-box present deliver? Artif. Intell. Med. 124, 102158 (2022).

Mengaldo, G. Explain the black box for the sake of science: the scientific method in the era of generative artificial intelligence. arXiv preprint arXiv:2406.10557 (2024).

Zhong, X., Gallagher, B., Liu, S. et al. Explainable machine learning in materials science. npj Comput. Mater. 8, 204 (2022).

Gallegos, M., Vassilev-Galindo, V., Poltavsky, I. et al. Explainable chemical artificial intelligence from accurate machine learning of real-space chemical descriptors. Nat. Commun. 15, 4345 (2024).

Moncada-Torres, A., van Maaren, M. C., Hendriks, M. P. et al. Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival. Sci. Rep. 11, 6968 (2021).

Lever, J., Krzywinski, M. & Altman, N. Model selection and overfitting. Nat. Methods 13, 703–704 (2016).

Welcome to the AI future? Nat. Astron. 7, 1 (2023).

Sapoval, N., Aghazadeh, A., Nute, M. G. et al. Current progress and open challenges for applying deep learning across the biosciences. Nat. Commun. 13, 1728 (2022).

Ridhawi, I. A., Otoum, S., Aloqaily, M. & Boukerche, A. Generalizing AI: challenges and opportunities for plug and play AI solutions. IEEE Netw. 35, 372–379 (2021).

The Open Reaction Database. J. Am. Chem. Soc. 143, 18820–18826 (2021).

Jacobsson, T. J. et al. An open-access database and analysis tool for perovskite solar cells based on the FAIR data principles. Nat. Energy 7, 107–115 (2022).

Gražulis, S. et al. Crystallography Open Database – an open-access collection of crystal structures. J. Appl. Crystallogr. 42, 726–729 (2009).

Barba, L. Reproducibility and Replicability in Science (National Academies Press, 2019).

Singh, P. Systematic review of data-centric approaches in artificial intelligence and machine learning. Data Sci. Manag. 6, 144–157 (2023).

Yang, J., Soltan, A. A. S., Eyre, D. W. et al. Algorithmic fairness and bias mitigation for clinical machine learning with deep reinforcement learning. Nat. Mach. Intell. 5, 884–894 (2023).

Abrámoff, M. D., Tarver, M. E., Loyo-Berrios, N. et al. Considerations for addressing bias in artificial intelligence for health equity. npj Digit. Med. 6, 170 (2023).

Zheng, P., Zubatyuk, R., Wu, W. et al. Artificial intelligence-enhanced quantum chemical method with broad applicability. Nat. Commun. 12, 7022 (2021).

Davies, J. Program good ethics into artificial intelligence. Nature 10, 538291 (2016).

Mhasawade, V., Zhao, Y. & Chunara, R. Machine learning and algorithmic fairness in public and population health. Nat. Mach. Intell. 3, 659–666 (2021).

Huerta, E. A., Khan, A., Huang, X. et al. Accelerated, scalable and reproducible AI-driven gravitational wave detection. Nat. Astron. 5, 1062–1068 (2021).

Space missions out of this world with AI. Nat. Mach. Intell. 5, 183 (2023).

Ma, P. X., Ng, C., Rizk, L. et al. A deep-learning search for technosignatures from 820 nearby stars. Nat. Astron. 7, 492–502 (2023).

Popper, K. The Open Universe: An Argument for Indeterminism From the Postscript to The Logic of Scientific Discovery (Routledge, 1992).

Åkesson, A., Curtsdotter, A., Eklöf, A. et al. The importance of species interactions in eco-evolutionary community dynamics under climate change. Nat. Commun. 12, 4759 (2021).

Physics Nobel scooped by machine-learning pioneers. Nature https://www.nature.com/articles/d41586-024-03213-8 (2024).

Chemistry Nobel goes to developers of AlphaFold AI that predicts protein structures. Nature https://www.nature.com/articles/d41586-024-03214-7 (2024).

Mitchell, M. Debates on the nature of artificial general intelligence. Science 383, 7869 (2024).

Nature Editorial. More-powerful AI is coming. Academia and industry must oversee it – together. Nature 636, 273 (2024).

Menon, S. S. et al. On scientific foundation models: rigorous definitions, key applications, and a comprehensive survey. Neural Netw. 198, 108567 (2026).

Ha, D. & Schmidhuber, J. World models. Preprint arXiv:1803.10122 (2018).

LeCun, Y. A path towards autonomous machine intelligence, version 0.9.2, 2022-06-27. Open Rev. 62, 1–62 (2022).

Mitrokhov, K. Between world models and model worlds: on generality, agency, and worlding in machine learning. AI Soc. 40, 5087–5099 (2025).