Generative AI helps design new proteins


Researchers at the University of Toronto have developed an artificial intelligence system that uses generative diffusion to create proteins not found in nature. This is the same technology behind popular image-generation platforms such as DALL-E and Midjourney.


This system will help advance the field of generative biology, which promises to speed up drug development by making the design and testing of novel therapeutic proteins more efficient and flexible.

“Our model learns from image representations to generate entirely new proteins at a very high rate,” says Philip M. Kim, a professor in the Donnelly Centre for Cellular and Biomolecular Research at the University of Toronto’s Temerty Faculty of Medicine. “All of our proteins look biophysically authentic, which means they fold into configurations that allow them to perform specific functions within the cell.”

Today, the journal Nature Computational Science published the findings, the first of their kind in a peer-reviewed journal. Kim’s lab also released a preprint on the model last summer through the open-access server bioRxiv, ahead of two similar preprints from last December: RF Diffusion by the University of Washington and Chroma by Generate Biomedicines.


Proteins are made up of chains of amino acids that fold into three-dimensional shapes, which in turn determine their function. With a better understanding of how existing proteins fold, researchers have begun to design folding patterns that are not produced in nature.

But a major challenge, according to Kim, has been imagining folds that are both possible and functional. Applying diffusion methods from the image-generation space to protein structures can address this problem, says Kim, who is also a professor in the departments of molecular genetics and computer science at U of T.

The new system, which the researchers call ProteinSGM, draws from a large set of image-like representations of existing proteins that precisely encode their structures. The researchers feed these images into a generative diffusion model, which gradually adds noise until each image is pure noise. The model tracks how the images become noisier and then learns to reverse the process, transforming random pixels into clear images that correspond to completely new proteins.
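As a rough illustration of the noising step described above, the sketch below builds an image-like matrix from toy coordinates and adds noise to it at increasing times. It is a minimal sketch under stated assumptions, not ProteinSGM's implementation: the pairwise-distance map, the variance-preserving noise schedule, the `beta_min` and `beta_max` values, and the function names are all illustrative choices that the article does not specify.

```python
import numpy as np


def distance_map(coords):
    """Pairwise Euclidean distances between points: an N x N 'image'.

    Assumption: a distance map stands in for the article's unspecified
    image-like representation of protein structure.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def forward_noise(x0, t, rng, beta_min=0.1, beta_max=20.0):
    """Sample a noised image x_t at time t in [0, 1].

    Assumption: a variance-preserving schedule with a linear beta(t);
    beta_min and beta_max are illustrative defaults.
    """
    # Integral of beta(s) = beta_min + s * (beta_max - beta_min) over [0, t].
    integral = beta_min * t + 0.5 * (beta_max - beta_min) * t**2
    signal = np.exp(-0.5 * integral)           # weight on the original image
    sigma = np.sqrt(1.0 - np.exp(-integral))   # std of the added noise
    noise = rng.standard_normal(x0.shape)
    return signal * x0 + sigma * noise, noise


rng = np.random.default_rng(0)

# Toy "backbone": a 3D random walk of 64 points (not a real protein).
coords = np.cumsum(rng.standard_normal((64, 3)), axis=0)
image = distance_map(coords)
image = (image - image.mean()) / image.std()  # normalize to unit scale

for t in (0.0, 0.25, 0.5, 1.0):
    x_t, _ = forward_noise(image, t, rng)
    corr = np.corrcoef(image.ravel(), x_t.ravel())[0, 1]
    print(f"t={t:.2f}  correlation with original image: {corr:.3f}")
```

The correlation with the original image falls toward zero as `t` approaches 1, which is the "all noise" endpoint. A generative model of this kind is trained to undo that corruption, so that sampling can start from pure noise and work backwards to a clean image.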

Jin Sub (Michael) Lee, a doctoral student in the Kim lab and lead author of the paper, says that optimizing the early stage of this image generation process was one of the biggest challenges in creating ProteinSGM. “The key idea was a suitable image-like representation of protein structure, so that the diffusion model could learn how to generate new proteins accurately,” says Lee, who is from Vancouver but completed his bachelor’s degree in South Korea and his master’s degree in Switzerland before choosing U of T for his doctorate.

Validation of proteins produced by ProteinSGM was also difficult. This system produces many structures that are often different from those found in nature. Almost all of them look real by standard metrics, but researchers needed more evidence.

To test the new proteins, Lee and his colleagues first turned to OmegaFold, an improved version of DeepMind’s software AlphaFold 2. Both platforms use AI to predict protein structures based on amino acid sequences.

Using OmegaFold, the team confirmed that nearly all of the novel sequences fold into the desired, and also novel, protein structures. They then selected a smaller number to create physically in test tubes, to confirm that the structures were proteins and not just stray strings of chemical compounds.
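The article does not say which metric the team used to judge whether a predicted structure matched the designed one. One common choice for this kind of comparison is the root-mean-square deviation (RMSD) between two sets of backbone coordinates after optimal superposition, computed with the Kabsch algorithm. The sketch below assumes that metric; the function name `kabsch_rmsd` and the toy coordinates are hypothetical and purely illustrative.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    P is rotated and translated onto Q using the Kabsch algorithm.
    """
    # Remove translation by centering both sets on their centroids.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (a reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    P_rot = P @ R.T  # apply the rotation to row vectors
    return float(np.sqrt(np.mean(np.sum((P_rot - Q) ** 2, axis=1))))


rng = np.random.default_rng(0)

# Toy "designed" backbone and a slightly perturbed "predicted" one.
designed = np.cumsum(rng.standard_normal((50, 3)), axis=0)
predicted = designed + 0.1 * rng.standard_normal(designed.shape)

print(f"RMSD after superposition: {kabsch_rmsd(designed, predicted):.3f}")
```

A low RMSD indicates that the predicted fold closely matches the design. The threshold that counts as a match is a judgment call that depends on the protein's size and the application.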

“The agreement between OmegaFold and experimental testing in the lab gave us confidence that these were properly folded proteins. It was amazing,” says Lee.

Based on this research, next steps include further development of ProteinSGM for antibodies and other proteins with the greatest therapeutic potential, says Kim. “This will be a very exciting area for research and entrepreneurship,” he adds.

Lee says he hopes that generative biology will move towards co-designing protein sequences and structures, including the side-chain conformations of proteins. Most research to date has focused on the generation of the backbone, the main chemical structure that holds proteins together.

“Side-chain configurations ultimately determine protein function. Engineering them means an exponential increase in complexity, but it could be possible with proper engineering,” says Lee. “We want to find out.”

Reference: Lee JS, Kim J, Kim PM. Score-based generative modeling for de novo protein design. Nature Computational Science. 2023:1-11. doi: 10.1038/s43588-023-00440-3

This article has been republished from source materials. Note: material may have been edited for length and content. For further information, please contact the cited source.


