Deep Learning in Protein Engineering: Designing Functional Soluble Proteins

Traditional protein design often relies on physics-based methods such as Rosetta, which struggle to produce functional proteins with complex structures because they depend on parametric and symmetry constraints. Recent advances in deep learning, particularly tools such as AlphaFold2, have changed protein design by enabling accurate structure prediction and exploration of a vast sequence space, yielding stable proteins with novel functions and complex structures. However, designing large and complex protein folds remains challenging, especially soluble analogs of membrane protein folds. Expanding the soluble fold space to include such analogs may unlock new functional possibilities for synthetic proteins.

Researchers from multiple institutions, including EPFL and the University of Washington, have developed a deep learning pipeline for designing soluble analogs of complex protein folds and membrane proteins. The approach uses AlphaFold2 and ProteinMPNN to create stable protein structures, including ones that mimic membrane proteins such as GPCRs, without parametric constraints or extensive experimental optimization. Biophysical analysis confirmed the high stability of the designs, and the experimentally determined structures closely matched the design models. The method may expand the functional soluble fold space and enable membrane protein features to be incorporated into soluble proteins, with applications in drug discovery and beyond.

The pipeline integrates AF2seq and ProteinMPNN. AF2seq generates sequences intended to adopt a target protein topology, and ProteinMPNN then optimizes those sequences to increase diversity and solubility. With this approach the researchers designed complex structures such as the Ig-like fold (IGF), the β-barrel, and the TIM barrel without traditional parametric constraints. Experimental validation showed high stability and close structural agreement with the design models. The success of the pipeline highlights its potential for exploring new protein topologies and for carrying membrane protein functionality into soluble proteins.
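The two-stage flow described above (first optimize a sequence toward a target topology, then diversify it) can be illustrated with a toy sketch. This is not AF2seq or ProteinMPNN: the stand-in below replaces structure prediction with a crude residue-preference lookup and replaces gradient-based optimization with random hill climbing. The state encoding (`H`, `E`, `L`), the `PREFERRED` table, and all function names are hypothetical and exist only to show the shape of the loop.

```python
import random

# Hypothetical toy encoding: one target state per position
# (H = helix, E = strand, L = loop) and a crude, illustrative
# table of residues loosely associated with each state.
PREFERRED = {
    "H": "AELMQKR",
    "E": "VIYFWT",
    "L": "GPNDS",
}
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def fit(seq, target):
    """Toy stand-in for a structure-based score: the fraction of
    positions whose residue suits the target state."""
    return sum(aa in PREFERRED[s] for aa, s in zip(seq, target)) / len(target)


def optimize_sequence(target, rng, steps=2000):
    """Stage 1 stand-in (the AF2seq role): hill-climb from a random
    sequence, keeping a mutation only if the score does not drop."""
    seq = [rng.choice(ALPHABET) for _ in target]
    best = fit(seq, target)
    for _ in range(steps):
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(ALPHABET)
        new = fit(seq, target)
        if new >= best:
            best = new
        else:
            seq[pos] = old  # revert a mutation that made the score worse
    return "".join(seq)


def diversify(seed, rng, n=5):
    """Stage 2 stand-in (the ProteinMPNN role): resample each residue
    within its own preference class, giving varied sequences that
    keep the same toy score as the seed."""
    def resample(aa):
        for members in PREFERRED.values():
            if aa in members:
                return rng.choice(members)
        return aa  # residues outside every class are left unchanged
    return ["".join(resample(aa) for aa in seed) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    target = "LHHHHHHHLLEEEEELLEEEEEL"
    seed = optimize_sequence(target, rng)
    print(seed, fit(seed, target))
    for variant in diversify(seed, rng):
        print(variant, fit(variant, target))
```

In a real setting the scoring function would be a structure predictor and the diversification step a learned sequence-design model; the sketch only mirrors the order of operations.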

The researchers then sought to design soluble analogs of membrane protein folds, which typically have distinctive structural features. Using the AF2seq-MPNN pipeline, they aimed to solubilize complex folds such as claudins, rhomboid proteases, and GPCRs. Initial attempts with the standard pipeline failed, but after retraining ProteinMPNN on soluble proteins only (MPNNsol), the designs succeeded. The resulting proteins were soluble and thermally stable, and their structures agreed closely with the design models for these challenging folds. High-resolution X-ray crystallography confirmed the accuracy of the designs and demonstrated that these membrane topologies can be converted into soluble forms, opening up a variety of biotechnological applications.
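One simple way to see why sequence composition matters for solubility is the GRAVY score, the mean Kyte-Doolittle hydropathy of a sequence. The sketch below is an illustration only, not a metric the article attributes to the researchers; membrane-spanning segments tend to score positive (hydrophobic), while soluble proteins tend to score lower. The function name and the example sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the grand average of hydropathy (GRAVY) of a sequence.

    Raises ValueError for an empty sequence and KeyError for
    characters outside the 20 standard amino acids.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    # Hypothetical example segments, chosen only to contrast the two regimes.
    membrane_like = "LLVVIIAALLVVIIAAFF"
    soluble_like = "KEEDKRSETNQKEDKRSE"
    print(gravy(membrane_like))
    print(gravy(soluble_like))
```

A score like this says nothing about folding or function; it only captures the bulk hydrophobicity that a solubility-oriented sequence design model would be expected to lower on the protein surface.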

The study also extended the design of soluble membrane protein analogs to functional capabilities. The researchers created soluble versions of human claudin-1 and claudin-4 that solubilize the transmembrane segments while preserving specific functional motifs, so the designs mimic the membrane-bound forms and retain their natural ability to bind Clostridium perfringens enterotoxin. They also designed chimeric soluble GPCR analogs incorporating functional domains of the ghrelin receptor and the adenosine A2A receptor. These analogs engaged in specific protein interactions, indicating that key functional sites were preserved. The approach has the potential to advance functional protein design and therapeutic discovery.

This work introduces a deep learning-based computational method for designing complex protein folds that overcomes previous limitations. The researchers generated high-quality protein backbones across a range of topologies without fold-specific retraining, and a high proportion of the designs proved soluble and properly folded in experiments. Structural validation confirmed the modeling accuracy that functional protein design requires. Importantly, the method extended design capabilities to membrane protein analogs with complex folds such as rhomboid proteases and GPCRs, which were soluble and monomeric in solution. The result paves the way for soluble proteins with native functionality, which could accelerate drug discovery targeting membrane proteins and broaden the scope of computational protein design.


All credit for this research goes to the researchers of this project.

Sana Hassan, a Consulting Intern at Marktechpost and a dual degree student at Indian Institute of Technology Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, she brings a fresh perspective to the intersection of AI and real-world solutions.
