Deep-learning structure elucidation from single-mutant deep mutational scanning
