Szeliski, R. Computer vision: algorithms and applications. Springer Nature (2022).
Khurana, D., Koli, A., Khatter, K. & Singh, S. Natural language processing: State of the art, current trends and challenges. Multimed. Tools Appl. 82, 3713–3744 (2023).
Kober, J. & Peters, J. Reinforcement learning in robotics: A survey. In Reinforcement Learning, pages 579–610. Springer (2012).
Dutta, S. Reinforcement Learning with TensorFlow: A beginner’s guide to designing self-learning systems with TensorFlow and OpenAI Gym. Packt Publishing Ltd (2018).
Recht, B. A tour of reinforcement learning: The view from continuous control. Annu. Rev. Control, Robot., Auton. Syst. 2, 253–279 (2019).
Agarwal, A., Jiang, N., Kakade, S. M. & Sun, W. Reinforcement learning: Theory and algorithms. Tech. Rep., CS Dept., UW Seattle, Seattle, WA, USA (2019).
Van Hasselt, H., Guez, A. & Silver, D. Deep reinforcement learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30 (2016).
Wang, Z. et al. Dueling network architectures for deep reinforcement learning. In International Conference on Machine Learning, pages 1995–2003. PMLR (2016).
Qureshi, A. H., Boots, B. & Yip, M. C. Adversarial imitation via variational inverse reinforcement learning. In International Conference on Learning Representations. https://openreview.net/forum?id=HJlmHoR5tQ (2019).
Cheng, C.-A., Yan, X., Wagener, N. & Boots, B. Fast policy learning through imitation and reinforcement. In Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (2018).
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).
Silver, D. et al. A general reinforcement learning algorithm that masters chess, Shogi, and Go through self-play. Science 362, 1140–1144 (2018).
Berner, C. et al. Dota 2 with large scale deep reinforcement learning. arXiv preprint arXiv:1912.06680 (2019).
Vinyals, O. et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575, 350–354 (2019).
Kaufmann, E. et al. Champion-level drone racing using deep reinforcement learning. Nature 620, 982–987 (2023).
Degrave, J. et al. Magnetic control of tokamak plasmas through deep reinforcement learning. Nature 602, 414–419 (2022).
Popova, M., Isayev, O. & Tropsha, A. Deep reinforcement learning for de novo drug design. Sci. Adv. 4, eaap7885 (2018).
Gazzola, M., Hejazialhosseini, B. & Koumoutsakos, P. Reinforcement learning and wavelet-adapted vortex methods for simulations of self-propelled swimmers. SIAM J. Sci. Comput. 36, B622–B639 (2014).
Colabrese, S., Gustavsson, K., Celani, A. & Biferale, L. Flow navigation by smart microswimmers via reinforcement learning. Phys. Rev. Lett. 118, 158004 (2017).
Pivot, C., Mathelin, L., Cordier, L., Guéniat, F. & Noack, B. R. A continuous reinforcement learning strategy for closed-loop control in fluid dynamics. In 35th AIAA Applied Aerodynamics Conference, page 3566 (2017).
Verma, S., Novati, G. & Koumoutsakos, P. Efficient collective swimming by harnessing vortices through deep reinforcement learning. Proc. Natl Acad. Sci. 115, 5849–5854 (2018).
Biferale, L., Bonaccorso, F., Buzzicotti, M., Clark Di Leoni, P. & Gustavsson, K. Zermelo’s problem: Optimal point-to-point navigation in 2D turbulent flows using reinforcement learning. Chaos: Interdiscip. J. Nonlinear Sci. 29, 103138 (2019).
Novati, G., Mahadevan, L. & Koumoutsakos, P. Controlled gliding and perching through deep-reinforcement-learning. Phys. Rev. Fluids 4, 093902 (2019).
Fan, D., Yang, L., Wang, Z., Triantafyllou, M. S. & Karniadakis, G. E. Reinforcement learning for bluff body active flow control in experiments and simulations. Proc. Natl Acad. Sci. 117, 26091–26098 (2020).
Rabault, J. & Kuhnle, A. Deep Reinforcement Learning Applied to Active Flow Control, pages 368–390. Cambridge University Press (2023).
Beintema, G., Corbetta, A., Biferale, L. & Toschi, F. Controlling Rayleigh–Bénard convection via reinforcement learning. J. Turbul. 21, 585–605 (2020).
Novati, G., de Laroussilhe, H. L. & Koumoutsakos, P. Automating turbulence modelling by multi-agent reinforcement learning. Nat. Mach. Intell. 3, 87–96 (2021).
Bae, H. J. & Koumoutsakos, P. Scientific multi-agent reinforcement learning for wall-models of turbulent flows. Nat. Commun. 13, 1443 (2022).
Levine, S., Kumar, A., Tucker, G. & Fu, J. Offline reinforcement learning: Tutorial, review, and perspectives on open problems. arXiv preprint arXiv:2005.01643 (2020).
Schaul, T., Quan, J., Antonoglou, I. & Silver, D. Prioritized experience replay. In Proceedings of the International Conference on Learning Representations (2016).
Rolnick, D., Ahuja, A., Schwarz, J., Lillicrap, T. & Wayne, G. Experience replay for continual learning. Adv. Neural Inf. Process. Syst. 32 (2019).
Andrychowicz, M. et al. Hindsight experience replay. Adv. Neural Inf. Process. Syst. 30 (2017).
Zhu, Z., Lin, K., Jain, A. K. & Zhou, J. Transfer learning in deep reinforcement learning: A survey. IEEE Trans. Pattern Anal. Mach. Intell. (2023).
Wang, J. X. et al. Learning to reinforcement learn. In Proceedings of the Annual Meeting of the Cognitive Science Society, vol. 39, https://escholarship.org/uc/item/1tn6q2t7 (2016).
Finn, C., Abbeel, P. & Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning, pages 1126–1135. PMLR (2017).
Sutton, R. S. Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Machine Learning Proceedings 1990, pages 216–224. Elsevier (1990).
Wang, T. et al. Benchmarking model-based reinforcement learning. arXiv preprint arXiv:1907.02057 (2019).
Brunton, S. L., Proctor, J. L. & Kutz, J. N. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl Acad. Sci. 113, 3932–3937 (2016).
Kaiser, E., Kutz, J. N. & Brunton, S. L. Sparse identification of nonlinear dynamics for model predictive control in the low-data limit. Proc. R. Soc. A 474, 20180335 (2018).
Lore, J. D. et al. Time-dependent SOLPS-ITER simulations of the tokamak plasma boundary for model predictive control using SINDy. Nucl. Fusion 63, 046015 (2023).
Farsi, M. & Liu, J. Structured online learning-based control of continuous-time nonlinear systems. IFAC-PapersOnLine 53, 8142–8149 (2020).
Arora, R., da Silva, B. C. & Moss, E. Model-based reinforcement learning with SINDy. In Decision Awareness in Reinforcement Learning Workshop at ICML, https://openreview.net/forum?id=3xBZY7LGorK (2022).
Fasel, U., Kutz, J. N., Brunton, B. W. & Brunton, S. L. Ensemble-SINDy: Robust sparse model discovery in the low-data, high-noise limit, with active learning and control. Proc. R. Soc. A 478, 20210904 (2022).
Tunyasuvunakool, S. et al. dm_control: Software and tasks for continuous control. Softw. Impacts 6, 100022 (2020).
Brockman, G. et al. OpenAI Gym. arXiv preprint arXiv:1606.01540 (2016).
Lagemann, C. et al. Hydrogym: A reinforcement learning platform for fluid dynamics. In 7th Annual Learning for Dynamics & Control Conference, pages 497–512. PMLR (2025).
Lagemann, C. et al. Hydrogym-GPU: From 2D to 3D benchmark environments for reinforcement learning in fluid flows. In Proceedings of the 35th International Conference on Parallel Computational Fluid Dynamics (ParCFD2024), Bonn, Germany (September 2024).
Rudy, S. H., Brunton, S. L., Proctor, J. L. & Kutz, J. N. Data-driven discovery of partial differential equations. Sci. Adv. 3, e1602614 (2017).
Schaeffer, H. & McCalla, S. G. Sparse model selection via integral terms. Phys. Rev. E 96, 023302 (2017).
Reinbold, P. A. K., Gurevich, D. R. & Grigoriev, R. O. Using noisy or incomplete data to discover models of spatiotemporal dynamics. Phys. Rev. E 101, 010203 (2020).
Messenger, D. A. & Bortz, D. M. Weak SINDy for partial differential equations. J. Comput. Phys. 443, 110525 (2021).
Messenger, D. A. & Bortz, D. M. Weak SINDy: Galerkin-based data-driven model selection. Multiscale Model. Simul. 19, 1474–1497 (2021).
Kaptanoglu, A. A., Callaham, J. L., Hansen, C. J., Aravkin, A. & Brunton, S. L. Promoting global stability in data-driven models of quadratic nonlinear dynamics. Phys. Rev. Fluids 6, 094401 (2021).
Forootani, A., Goyal, P. & Benner, P. A robust SINDy approach by combining neural networks and an integral form. arXiv preprint arXiv:2309.07193 (2023).
Schroeder, M. Synthesis of low-peak-factor signals and binary sequences with low autocorrelation (corresp.). IEEE Trans. Inf. Theory 16, 85–89 (1970).
Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017).
Ng, A. Y., Harada, D. & Russell, S. Policy invariance under reward transformations: Theory and application to reward shaping. In Proceedings of the 16th International Conference on Machine Learning (ICML), pages 278–287 (1999).
Arora, S. & Doshi, P. A survey of inverse reinforcement learning: Challenges, methods and progress. Artif. Intell. 297, 103500 (2021).
Hussein, A., Gaber, M. M., Elyan, E. & Jayne, C. Imitation learning: A survey of learning methods. ACM Comput. Surv. (CSUR) 50, 1–35 (2017).
Mania, H., Guy, A. & Recht, B. Simple random search of static linear policies is competitive for reinforcement learning. Adv. Neural Inf. Process. Syst. 31 (2018).
Rajeswaran, A., Lowrey, K., Todorov, E. V. & Kakade, S. M. Towards generalization and simplicity in continuous control. Adv. Neural Inf. Process. Syst. 30 (2017).
Zhu, F., Jing, D., Leve, F. & Ferrari, S. Nn-poly: Approximating common neural networks with Taylor polynomials to imbue dynamical system constraints. Front. Robot. AI 9, 968305 (2022).
Clavera, I. et al. Model-based reinforcement learning via meta-policy optimization. In Conference on Robot Learning, pages 617–629. PMLR (2018).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
Choi, S., Choi, H. & Kang, S. Characteristics of flow over a rotationally oscillating cylinder at low Reynolds number. Phys. Fluids 14, 2767–2777 (2002).
Rabault, J., Kuchta, M., Jensen, A., Réglade, U. & Cerardi, N. Artificial neural networks trained through deep reinforcement learning discover control strategies for active flow control. J. Fluid Mech. 865, 281–302 (2019).
Liaw, R. et al. Tune: A research platform for distributed model selection and training. arXiv preprint arXiv:1807.05118, https://openreview.net/forum?id=KHlWtBm8BJ; https://docs.ray.io/en/latest/tune/index.html#citing-tune (2018).
Weng, J. et al. Tianshou: A highly modularized deep reinforcement learning library. J. Mach. Learn. Res. 23, 12275–12280 (2022).
Huang, S. et al. CleanRL: High-quality single-file implementations of deep reinforcement learning algorithms. J. Mach. Learn. Res. 23, 1–18 (2022).
Franceschetti, M., Lacoux, C., Ohouens, R., Raffin, A. & Sigaud, O. Making reinforcement learning work on Swimmer. arXiv preprint arXiv:2208.07587 (2022).
Jaderberg, M. et al. Population based training of neural networks. arXiv preprint arXiv:1711.09846 (2017).
Deng, N., Noack, B. R., Morzyński, M. & Pastur, L. R. Low-order model for successive bifurcations of the fluidic pinball. J. Fluid Mech. 884, A37 (2020).
Champion, K., Lusch, B., Kutz, J. N. & Brunton, S. L. Data-driven discovery of coordinates and governing equations. Proc. Natl Acad. Sci. 116, 22445–22451 (2019).
Cranmer, M. et al. Discovering symbolic models from deep learning with inductive biases. Adv. Neural Inf. Process. Syst. 33, 17429–17442 (2020).
Udrescu, S.-M. & Tegmark, M. AI Feynman: A physics-inspired method for symbolic regression. Sci. Adv. 6, eaay2631 (2020).
Kim, S. et al. Integration of neural network-based symbolic regression in deep learning for scientific discovery. IEEE Trans. Neural Netw. Learn. Syst. 32, 4166–4177 (2020).
Sahoo, S., Lampert, C. & Martius, G. Learning equations for extrapolation and control. In Proceedings of the 35th International Conference on Machine Learning, pages 4442–4450. PMLR (2018).
Both, G.-J., Choudhury, S., Sens, P. & Kusters, R. Deepmod: Deep learning for model discovery in noisy data. J. Comput. Phys. 428, 109985 (2021).
Pathak, D., Agrawal, P., Efros, A. A. & Darrell, T. Curiosity-driven exploration by self-supervised prediction. In International Conference on Machine Learning, pages 2778–2787. PMLR (2017).
Loiseau, J.-C. & Brunton, S. L. Constrained sparse Galerkin regression. J. Fluid Mech. 838, 42–67 (2018).
Otto, S. E., Zolman, N., Kutz, J. N. & Brunton, S. L. A unified framework to enforce, discover, and promote symmetry in machine learning. arXiv preprint arXiv:2311.00212 (2023).
Ahmadi, A. A. & El Khadir, B. Learning dynamical systems with side information. In Learning for Dynamics and Control, pages 718–727. PMLR (2020).
Bramburger, J. J., Dahdah, S. & Forbes, J. R. Synthesizing control laws from data using sum-of-squares optimization. In 2024 IEEE Conference on Control Technology and Applications (CCTA), pages 505–510. IEEE (2024).
Wolf, F., Botteghi, N., Fasel, U. & Manzoni, A. Interpretable and efficient data-driven discovery and control of distributed systems. arXiv preprint arXiv:2411.04098 (2024).
Bakarji, J., Champion, K., Kutz, J. N. & Brunton, S. L. Discovering governing equations from partial measurements with deep delay autoencoders. Proc. R. Soc. A 479, 20230422 (2023).
Salimans, T., Ho, J., Chen, X., Sidor, S. & Sutskever, I. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864 (2017).
Lopez, B. T., Slotine, J.-J. E., & How, J. P. Dynamic tube MPC for nonlinear systems. In 2019 American Control Conference (ACC), pages 1655–1662. IEEE (2019).
Zolman, N., Lagemann, C., Fasel, U., Kutz, J. N. & Brunton, S. L. sindy-rl_data (revision d295c18) (2025).
Zolman, N., Lagemann, C., Fasel, U., Kutz, J. N. & Brunton, S. L. nzolman/sindy-rl: nat-comms-v1 (September 2025).
Zolman, N., Lagemann, C., Fasel, U., Kutz, J. N. & Brunton, S. L. nzolman/sindy-rl_3dairfoil: nat-comms-v1 (September 2025).
