Chemistry-informed deep learning model for predicting stereoselectivity and absolute configuration in asymmetric hydrogenation





