Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Berman, H., Henrick, K. & Nakamura, H. Announcing the worldwide Protein Data Bank. Nat. Struct. Mol. Biol. 10, 980 (2003).
Mohs, R. C. & Greig, N. H. Drug discovery and development: role of basic biological research. Alzheimer’s Dement. Transl. Res. Clin. Interv. 3, 651–657 (2017).
Sliwoski, G., Kothiwale, S., Meiler, J. & Lowe, E. W. Computational methods in drug discovery. Pharmacol. Rev. 66, 334–395 (2014).
Thiel, W. Semiempirical quantum-chemical methods. WIREs Comput. Mol. Sci. 4, 145–157 (2014).
Hollingsworth, S. A. & Dror, R. O. Molecular dynamics simulation for all. Neuron 99, 1129–1143 (2018).
Siebenmorgen, T. & Zacharias, M. Computational prediction of protein–protein binding affinities. WIREs Comput. Mol. Sci. 10, e1448 (2020).
Trott, O. & Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31, 455–461 (2010).
Kmiecik, S. et al. Coarse-grained protein models and their applications. Chem. Rev. 116, 7898–7936 (2016).
Spicher, S. & Grimme, S. Robust atomistic modeling of materials, organometallic, and biochemical systems. Angew. Chem. Int. Ed. 59, 15665–15673 (2020).
Vandenbrande, S., Waroquier, M., Speybroeck, V. V. & Verstraelen, T. The monomer electron density force field (MEDFF): a physically inspired model for noncovalent interactions. J. Chem. Theory Comput. 13, 161–179 (2017).
Wang, J. & Dokholyan, N. V. Yuel: improving the generalizability of structure-free compound–protein interaction prediction. J. Chem. Inf. Model. 62, 463–471 (2022).
Ponder, J. W. et al. Current status of the AMOEBA polarizable force field. J. Phys. Chem. B 114, 2549–2564 (2010).
Chen, B. et al. Automated discovery of fundamental variables hidden in experimental data. Nat. Comput. Sci. 2, 433–442 (2022).
Durrant, J. D. & McCammon, J. A. NNScore: a neural-network-based scoring function for the characterization of protein−ligand complexes. J. Chem. Inf. Model. 50, 1865–1871 (2010).
Wang, X., Terashi, G., Christoffer, C. W., Zhu, M. & Kihara, D. Protein docking model evaluation by 3D deep convolutional neural networks. Bioinformatics 36, 2113–2118 (2020).
Wang, N.-N. et al. ADME properties evaluation in drug discovery: prediction of Caco-2 cell permeability using a combination of NSGA-II and boosting. J. Chem. Inf. Model. 56, 763–773 (2016).
Ishida, S., Terayama, K., Kojima, R., Takasu, K. & Okuno, Y. AI-driven synthetic route design incorporated with retrosynthesis knowledge. J. Chem. Inf. Model. 62, 1357–1367 (2022).
Karpov, P., Godin, G. & Tetko, I. V. A transformer model for retrosynthesis. In Artificial Neural Networks and Machine Learning—ICANN 2019: Workshop and Special Sessions (eds Tetko, I. V. et al.) 817–830 (Springer, 2019).
Öztürk, H., Özgür, A. & Ozkirimli, E. DeepDTA: deep drug–target binding affinity prediction. Bioinformatics 34, i821–i829 (2018).
Karimi, M., Wu, D., Wang, Z. & Shen, Y. DeepAffinity: interpretable deep learning of compound–protein affinity through unified recurrent and convolutional neural networks. Bioinformatics 35, 3329–3338 (2019).
Hassan-Harrirou, H., Zhang, C. & Lemmin, T. RosENet: improving binding affinity prediction by leveraging molecular mechanics energies with an ensemble of 3D convolutional neural networks. J. Chem. Inf. Model. 60, 2791–2802 (2020).
Feinberg, E. N. et al. PotentialNet for molecular property prediction. ACS Cent. Sci. 4, 1520–1530 (2018).
Li, Y., Rezaei, M. A., Li, C. & Li, X. DeepAtom: a framework for protein–ligand binding affinity prediction. In 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 303–310 (IEEE, 2019).
Wang, R., Fang, X., Lu, Y., Yang, C.-Y. & Wang, S. The PDBbind database: methodologies and updates. J. Med. Chem. 48, 4111–4119 (2005).
Liu, T., Lin, Y., Wen, X., Jorissen, R. N. & Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities. Nucleic Acids Res. 35, D198–D201 (2007).
Hu, L., Benson, M. L., Smith, R. D., Lerner, M. G. & Carlson, H. A. Binding MOAD (Mother Of All Databases). Proteins Struct. Funct. Bioinform. 60, 333–340 (2005).
Friedrich, N.-O., Simsir, M. & Kirchmair, J. How diverse are the protein-bound conformations of small-molecule drugs and cofactors? Front. Chem. 6, 68 (2018).
Korlepara, D. B. et al. PLAS-5k: dataset of protein–ligand affinities from molecular dynamics for machine learning applications. Sci. Data 9, 548 (2022).
Korlepara, D. B. et al. PLAS-20k: extended dataset of protein–ligand affinities from MD simulations for machine learning applications. Sci. Data 11, 180 (2024).
Yang, J., Shen, C. & Huang, N. Predicting or pretending: artificial intelligence for protein–ligand interactions lack of sufficiently large and unbiased datasets. Front. Pharmacol. 11, 69 (2020).
Volkov, M. et al. On the frustration to predict binding affinities from protein–ligand structures with deep neural networks. J. Med. Chem. 65, 7946–7958 (2022).
Vajda, S., Beglov, D., Wakefield, A. E., Egbert, M. & Whitty, A. Cryptic binding sites on proteins: definition, detection, and druggability. Curr. Opin. Chem. Biol. 44, 1–8 (2018).
Zeng, L. et al. Selective small molecules blocking HIV-1 Tat and coactivator PCAF association. J. Am. Chem. Soc. 127, 2376–2377 (2005).
Johnson, R. D. III (ed). Computational Chemistry Comparison and Benchmark Database Standard Reference Database Number 101 Release 22 (NIST, accessed 12 Jul 2022); http://cccbdb.nist.gov/
Bista, M. et al. Transient protein states in designing inhibitors of the MDM2–p53 interaction. Structure 21, 2143–2151 (2013).
Xie, M. et al. Structural basis of inhibition of ERα–coactivator interaction by high-affinity N-terminus isoaspartic acid tethered helical peptides. J. Med. Chem. 60, 8731–8740 (2017).
Jakalian, A., Jack, D. B. & Bayly, C. I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation. J. Comput. Chem. 23, 1623–1641 (2002).
Dodda, L. S., Vilseck, J. Z., Tirado-Rives, J. & Jorgensen, W. L. 1.14*CM1A-LBCC: localized bond-charge corrected CM1A charges for condensed-phase simulations. J. Phys. Chem. B 121, 3864–3870 (2017).
Jorgensen, W. L., Maxwell, D. S. & Tirado-Rives, J. Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids. J. Am. Chem. Soc. 118, 11225–11236 (1996).
Storer, J. W., Giesen, D. J., Cramer, C. J. & Truhlar, D. G. Class IV charge models: a new semiempirical approach in quantum chemistry. J. Comput. Aided Mol. Des. 9, 87–110 (1995).
Li, J., Zhu, T., Cramer, C. J. & Truhlar, D. G. New class IV charge model for extracting accurate partial charges from wave functions. J. Phys. Chem. A 102, 1820–1831 (1998).
Thompson, J. D., Cramer, C. J. & Truhlar, D. G. Parameterization of charge model 3 for AM1, PM3, BLYP, and B3LYP. J. Comput. Chem. 24, 1291–1304 (2003).
Grimme, S. & Bannwarth, C. Ultra-fast computation of electronic spectra for large systems by tight-binding based simplified Tamm–Dancoff approximation (sTDA-xTB). J. Chem. Phys. 145, 054103 (2016).
Wang, E. et al. End-point binding free energy calculation with MM/PBSA and MM/GBSA: strategies and applications in drug design. Chem. Rev. 119, 9478–9508 (2019).
Sun, Z., Liu, Q., Qu, G., Feng, Y. & Reetz, M. T. Utility of B factors in protein science: interpreting rigidity, flexibility, and internal motion and engineering thermostability. Chem. Rev. 119, 1626–1665 (2019).
Guilligay, D. et al. The structural basis for cap binding by influenza virus polymerase subunit PB2. Nat. Struct. Mol. Biol. 15, 500–506 (2008).
Rayne, S. & Forest, K. Benchmarking semiempirical, Hartree–Fock, DFT, and MP2 methods against the ionization energies and electron affinities of short- through long-chain [n]acenes and [n]phenacenes. Can. J. Chem. 94, 251–258 (2016).
Zhan, C.-G., Nichols, J. A. & Dixon, D. A. Ionization potential, electron affinity, electronegativity, hardness, and electron excitation energy: molecular properties from density functional theory orbital energies. J. Phys. Chem. A 107, 4184–4195 (2003).
Lange, G. et al. Requirements for specific binding of low affinity inhibitor fragments to the SH2 domain of pp60Src are identical to those for high affinity binding of full length inhibitors. J. Med. Chem. 46, 5184–5195 (2003).
Öster, L., Tapani, S., Xue, Y. & Käck, H. Successful generation of structural information for fragment-based drug discovery. Drug Discov. Today 20, 1104–1111 (2015).
Heinzlmeir, S. et al. Chemoproteomics-aided medicinal chemistry for the discovery of EPHA2 inhibitors. ChemMedChem 12, 999–1011 (2017).
Gaieb, Z. et al. D3R Grand Challenge 2: blind prediction of protein–ligand poses, affinity rankings, and relative binding free energies. J. Comput. Aided Mol. Des. 32, 1–20 (2018).
Whitehouse, A. J. et al. Development of inhibitors against Mycobacterium abscessus tRNA (m1G37) methyltransferase (TrmD) using fragment-based approaches. J. Med. Chem. 62, 7210–7232 (2019).
Menezes, F. & Popowicz, G. M. ULYSSES: an efficient and easy to use semiempirical library for C++. J. Chem. Inf. Model. 62, 3685–3694 (2022).
Bannwarth, C., Ehlert, S. & Grimme, S. GFN2-xTB—an accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions. J. Chem. Theory Comput. 15, 1652–1671 (2019).
Dewar, M. J. S., Zoebisch, E. G., Healy, E. F. & Stewart, J. J. P. Development and use of quantum mechanical molecular models. 76. AM1: a new general purpose quantum mechanical molecular model. J. Am. Chem. Soc. 107, 3902–3909 (1985).
Stewart, J. J. P. Application of the PM6 method to modeling proteins. J. Mol. Model. 15, 765–805 (2009).
Sigalov, G., Fenley, A. & Onufriev, A. Analytical electrostatics for biomolecules: beyond the generalized Born approximation. J. Chem. Phys. 124, 124902 (2006).
Christensen, A. S., Kubař, T., Cui, Q. & Elstner, M. Semiempirical quantum mechanical methods for noncovalent interactions for chemical and biochemical applications. Chem. Rev. 116, 5301–5337 (2016).
Dixon, S. L. & Merz, K. M. Fast, accurate semiempirical molecular orbital calculations for macromolecules. J. Chem. Phys. 107, 879–893 (1997).
O’Boyle, N. M. et al. Open Babel: an open chemical toolbox. J. Cheminform. 3, 33 (2011).
Caldeweyher, E. et al. A generally applicable atomic-charge dependent London dispersion correction. J. Chem. Phys. 150, 154122 (2019).
Hanwell, M. D. et al. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. J. Cheminform. 4, 17 (2012).
Case, D. A. et al. Amber 2021 (Univ. of California, San Francisco, 2021).
Wang, J., Wolf, R. M., Caldwell, J. W., Kollman, P. A. & Case, D. A. Development and testing of a general Amber force field. J. Comput. Chem. 25, 1157–1174 (2004).
Maier, J. A. et al. ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB. J. Chem. Theory Comput. 11, 3696–3713 (2015).
Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W. & Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 79, 926–935 (1983).
Townshend, R. J. L. et al. ATOM3D: tasks on molecules in three dimensions. Preprint at https://doi.org/10.48550/arXiv.2012.04035 (2022).
Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. Preprint at https://doi.org/10.48550/arXiv.1609.02907 (2017).
Huang, Y., Niu, B., Gao, Y., Fu, L. & Li, W. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 26, 680–682 (2010).
Forli, S. et al. Computational protein–ligand docking and virtual drug screening with the AutoDock suite. Nat. Protoc. 11, 905–919 (2016).
Zhao, Y., Stoffler, D. & Sanner, M. Hierarchical and multi-resolution representation of protein flexibility. Bioinformatics 22, 2768–2774 (2006).
Ravindranath, P. A., Forli, S., Goodsell, D. S., Olson, A. J. & Sanner, M. F. AutoDockFR: advances in protein–ligand docking with explicitly specified binding site flexibility. PLoS Comput. Biol. 11, e1004586 (2015).
Delaglio, F. et al. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293 (1995).
Johnson, B. A. & Blevins, R. A. NMR View: a computer program for the visualization and analysis of NMR data. J. Biomol. NMR 4, 603–614 (1994).
Siebenmorgen, T. et al. MISATO—machine learning dataset for structure-based drug discovery. Zenodo https://doi.org/10.5281/zenodo.7711953 (2023).
t7morgen/misato-dataset: release for publication. Zenodo https://doi.org/10.5281/zenodo.10926008 (2024).
