von Lilienfeld, O. A., Müller, K.-R. & Tkatchenko, A. Exploring chemical compound space with quantum-based machine learning. Nat. Rev. Chem. 4, 347–358 (2020).
Wouters, O. J., McKee, M. & Luyten, J. Estimated research and development investment needed to bring a new medicine to market, 2009–2018. JAMA 323, 844–853 (2020).
Gromski, P. S., Henson, A. B., Granda, J. M. & Cronin, L. How to explore chemical space using algorithms and automation. Nat. Rev. Chem. 3, 119–128 (2019).
DiMasi, J. A., Grabowski, H. G. & Hansen, R. W. Innovation in the pharmaceutical industry: new estimates of R&D costs. J. Health Econ. 47, 20–33 (2016).
Pensak, D. A. & Corey, E. J. LHASA—logic and heuristics applied to synthetic analysis. ACS Symp. Ser. 61, 1–32 (1977).
Wipke, W. T., Braun, H., Smith, G., Choplin, F. & Sieber, W. K. SECS—simulation and evaluation of chemical synthesis: strategy and planning. ACS Symp. Ser. 61, 92–127 (1977).
Kayala, M. A., Azencott, C.-A., Chen, J. H. & Baldi, P. Learning to predict chemical reactions. J. Chem. Inf. Model. 51, 2209–2222 (2011).
Houk, K. N. & Cheong, P. H.-Y. Computational prediction of small-molecule catalysts. Nature 455, 309–313 (2008).
Jin, W., Coley, C. W., Barzilay, R. & Jaakkola, T. Predicting organic reaction outcomes with Weisfeiler-Lehman network. In Advances in Neural Information Processing Systems, Vol. 30 (eds Guyon, I. et al.) 2604–2613 (NeurIPS, 2017).
Ahn, S., Hong, M., Sundararajan, M., Ess, D. H. & Baik, M.-H. Design and optimization of catalysts based on mechanistic insights derived from quantum chemical reaction modeling. Chem. Rev. 119, 6509–6560 (2019).
Coley, C. W. et al. A graph-convolutional neural network model for the prediction of chemical reactivity. Chem. Sci. 10, 370–377 (2019).
Coley, C. W., Barzilay, R., Jaakkola, T. S., Green, W. H. & Jensen, K. F. Prediction of organic reaction outcomes using machine learning. ACS Cent. Sci. 3, 434–443 (2017).
Neel, A. J., Milo, A., Sigman, M. S. & Toste, F. D. Enantiodivergent fluorination of allylic alcohols: data set design reveals structural interplay between achiral directing group and chiral anion. J. Am. Chem. Soc. 138, 3863–3875 (2016).
Knowles, R. R. & Jacobsen, E. N. Attractive noncovalent interactions in asymmetric catalysis: links between enzymes and small molecule catalysts. Proc. Natl Acad. Sci. USA 107, 20678–20685 (2010).
Bi, H. et al. Non-autoregressive electron redistribution modeling for reaction prediction. In Proc. 38th International Conference on Machine Learning, Vol. 139 (eds Meila, M. & Zhang, T.) 904–913 (PMLR, 2021).
Thakkar, A., Kogej, T., Reymond, J.-L., Engkvist, O. & Bjerrum, E. J. Datasets and their influence on the development of computer assisted synthesis planning tools in the pharmaceutical domain. Chem. Sci. 11, 154–168 (2019).
Keto, A. et al. Data-efficient, chemistry-aware machine learning predictions of Diels–Alder reaction outcomes. J. Am. Chem. Soc. 146, 16052–16061 (2024).
Schwaller, P. et al. Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS Cent. Sci. 5, 1572–1583 (2019).
Lu, J. & Zhang, Y. Unified deep learning model for multitask reaction predictions with explanation. J. Chem. Inf. Model. 62, 1376–1387 (2022).
Irwin, R., Dimitriadis, S., He, J. & Bjerrum, E. J. Chemformer: a pre-trained transformer for computational chemistry. Mach. Learn. Sci. Technol. 3, 015022 (2022).
Pesciullesi, G., Schwaller, P., Laino, T. & Reymond, J.-L. Transfer learning enables the molecular transformer to predict regio- and stereoselective reactions on carbohydrates. Nat. Commun. 11, 4874 (2020).
Tetko, I. V., Karpov, P., Van Deursen, R. & Godin, G. State-of-the-art augmented NLP transformer models for direct and single-step retrosynthesis. Nat. Commun. 11, 5575 (2020).
Li, S.-W., Xu, L.-C., Zhang, C., Zhang, S.-Q. & Hong, X. Reaction performance prediction with an extrapolative and interpretable graph model based on chemical knowledge. Nat. Commun. 14, 3569 (2023).
Knowles, W. S. Asymmetric hydrogenations (Nobel lecture). Angew. Chem. Int. Ed. 41, 1998–2007 (2002).
Eberhardt, L., Armspach, D., Harrowfield, J. & Matt, D. BINOL-derived phosphoramidites in asymmetric hydrogenation: can the presence of a functionality in the amino group influence the catalytic outcome? Chem. Soc. Rev. 37, 839–864 (2008).
Verendel, J. J., Pàmies, O., Diéguez, M. & Andersson, P. G. Asymmetric hydrogenation of olefins using chiral Crabtree-type catalysts: scope and limitations. Chem. Rev. 114, 2130–2169 (2014).
Zhang, Z., Butt, N. A. & Zhang, W. Asymmetric hydrogenation of nonaromatic cyclic substrates. Chem. Rev. 116, 14769–14827 (2016).
Massaro, L., Zheng, J., Margarita, C. & Andersson, P. G. Enantioconvergent and enantiodivergent catalytic hydrogenation of isomeric olefins. Chem. Soc. Rev. 49, 2504–2522 (2020).
Janssen-Müller, D., Schlepphorst, C. & Glorius, F. Privileged chiral N-heterocyclic carbene ligands for asymmetric transition-metal catalysis. Chem. Soc. Rev. 46, 4845–4854 (2017).
Wen, J., Wang, F. & Zhang, X. Asymmetric hydrogenation catalyzed by first-row transition metal complexes. Chem. Soc. Rev. 50, 3211–3237 (2021).
Wang, Q. et al. Rhodium-catalyzed enantioselective hydrogenation of tetrasubstituted α-acetoxy β-enamido esters: a new approach to chiral α-hydroxyl-β-amino acid derivatives. J. Am. Chem. Soc. 136, 16120–16123 (2014).
Yoshikai, Y., Mizuno, T., Nemoto, S. & Kusuhara, H. Difficulty in chirality recognition for transformer architectures learning chemical structures from string representations. Nat. Commun. 15, 1197 (2024).
Sigman, M. S., Harper, K. C., Bess, E. N. & Milo, A. The development of multidimensional analysis tools for asymmetric catalysis and beyond. Acc. Chem. Res. 49, 1292–1301 (2016).
Reid, J. P. & Sigman, M. S. Comparing quantitative prediction methods for the discovery of small-molecule chiral catalysts. Nat. Rev. Chem. 2, 290–305 (2018).
Santiago, C. B., Guo, J.-Y. & Sigman, M. S. Predictive and mechanistic multivariate linear regression models for reaction development. Chem. Sci. 9, 2398–2412 (2018).
Gallarati, S. et al. Reaction-based machine learning representations for predicting the enantioselectivity of organocatalysts. Chem. Sci. 12, 6879–6889 (2021).
Ravasco, J. M. J. M. & Coelho, J. A. S. Predictive multivariate models for bioorthogonal inverse-electron demand Diels–Alder reactions. J. Am. Chem. Soc. 142, 4235–4241 (2020).
Singh, S. et al. A unified machine-learning protocol for asymmetric catalysis as a proof of concept demonstration using asymmetric hydrogenation. Proc. Natl Acad. Sci. USA 117, 1339–1345 (2020).
Xu, L.-C. et al. Enantioselectivity prediction of pallada-electrocatalysed C–H activation using transition state knowledge in machine learning. Nat. Synth. 2, 321–330 (2023).
Ahneman, D. T., Estrada, J. G., Lin, S., Dreher, S. D. & Doyle, A. G. Predicting reaction performance in C–N cross-coupling using machine learning. Science 360, 186–190 (2018).
Reid, J. P. & Sigman, M. S. Holistic prediction of enantioselectivity in asymmetric catalysis. Nature 571, 343–348 (2019).
Zahrt, A. F. et al. Prediction of higher-selectivity catalysts by computer-driven workflow and machine learning. Science 363, eaau5631 (2019).
Xu, L.-C. et al. Towards data-driven design of asymmetric hydrogenation of olefins: database and hierarchical learning. Angew. Chem. Int. Ed. 60, 22804–22811 (2021).
Moskal, M., Beker, W., Szymkuć, S. & Grzybowski, B. A. Scaffold-directed face selectivity machine-learned from vectors of non-covalent interactions. Angew. Chem. Int. Ed. 60, 15230–15235 (2021).
Cheng, L. et al. Chemistry-informed deep learning model for predicting stereoselectivity and absolute configuration in asymmetric hydrogenation. Zenodo https://doi.org/10.5281/zenodo.17346605 (2025).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, Vol. 30 (eds Guyon, I. et al.) 4768–4777 (NeurIPS, 2017).
Lundberg, S. M., Erion, G. G. & Lee, S.-I. Consistent individualized feature attribution for tree ensembles. Preprint at https://doi.org/10.48550/arXiv.1802.03888 (2019).
Heberle, H., Zhao, L., Schmidt, S., Wolf, T. & Heinrich, J. XSMILES: interactive visualization for molecules, SMILES and XAI attribution scores. J. Cheminform. 15, 2 (2023).
Landis, C. R. & Halpern, J. Asymmetric hydrogenation of methyl (Z)-α-acetamidocinnamate catalyzed by [1,2-bis((phenyl-o-anisoyl)phosphino)ethane]rhodium(I): kinetics, mechanism and origin of enantioselection. J. Am. Chem. Soc. 109, 1746–1754 (1987).
Mohar, B. & Stephan, M. Practical enantioselective hydrogenation of α-aryl- and α-carboxyamidoethylenes by rhodium(I)-1,2-bis[(o-tert-butoxyphenyl)(phenyl)phosphino]ethane. Adv. Synth. Catal. 355, 594–600 (2013).
Li, C. et al. Stereoelectronic effects in ligand design: enantioselective rhodium-catalyzed hydrogenation of aliphatic cyclic tetrasubstituted enamides and concise synthesis of (R)-tofacitinib. Angew. Chem. Int. Ed. 58, 13573–13583 (2019).
RDKit: open-source chemoinformatics and machine learning. RDKit.org (accessed 15 July 2024); https://rdkit.org/
Ahmad, W., Simon, E., Chithrananda, S., Grand, G. & Ramsundar, B. ChemBERTa-2: towards chemical foundation models. Preprint at https://doi.org/10.48550/arXiv.2209.01712 (2022).
Kim, Y. Convolutional neural networks for sentence classification. In Proc. Conference on Empirical Methods in Natural Language Processing (eds Moschitti, A., Pang, B. & Daelemans, W.) 1746–1751 (EMNLP, 2014).
Wang, S., Huang, M. & Deng, Z. Densely connected CNN with multi-scale feature attention for text classification. In Proc. 27th International Joint Conference on Artificial Intelligence, 4468–4474 (IJCAI, 2018).
Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems, Vol. 30 (eds Guyon, I. et al.) 6000–6010 (NeurIPS, 2017).
Nam, J. & Kim, J. Linking the neural machine translation and the prediction of organic chemistry reactions. Preprint at https://doi.org/10.48550/arXiv.1612.09529 (2016).
Schwaller, P., Gaudin, T., Lányi, D., Bekas, C. & Laino, T. “Found in translation”: predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models. Chem. Sci. 9, 6091–6098 (2018).
Gehring, J., Auli, M., Grangier, D., Yarats, D. & Dauphin, Y. N. Convolutional sequence to sequence learning. In Proc. 34th International Conference on Machine Learning, Vol. 70 (eds Precup, D. & Teh, Y. W.) 1243–1252 (PMLR, 2017).
Zhang, X., Zhou, X., Lin, M. & Sun, J. ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 6848–6856 (IEEE, 2018).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016, 770–778 (IEEE, 2016).
He, K., Zhang, X., Ren, S. & Sun, J. Identity mappings in deep residual networks. In Proc. 14th European Conference on Computer Vision (eds Leibe, B., Matas, J., Sebe, N. & Welling, M.) 630–645 (ECCV, 2016).
