A team of researchers at Baidu Research has developed an AI algorithm that can rapidly design previously unattainable highly stable COVID-19 mRNA vaccine sequences. Dubbed LinearDesign, the algorithm made a giant leap in both vaccine sequence stability and efficacy, resulting in a 128-fold increase in antibody responses in the COVID-19 vaccine.
The team, working with Oregon State University, StemiRNA Therapeutics, and the University of Rochester Medical Center, published the study, titled “Algorithm for optimized mRNA design improves stability and immunogenicity,” in Nature.
“In this paper, we take a classical approach from natural language processing (NLP) and use the elegant and simple solutions employed to understand words and grammars to show how we can solve complex biological problems,” said Dr. He Zhang, staff software engineer at Baidu Research.
mRNA, or messenger RNA, has emerged as a revolutionary technology for vaccine development and potential treatments for cancer and other diseases. mRNA serves as the key messenger that carries genetic instructions from DNA to the protein production machinery of the cell, enabling the creation of specific proteins for various functions in the human body. Due to its many advantages in safety, efficacy, and production, mRNA has been rapidly adopted in the process of COVID-19 vaccine development.
However, the natural instability of mRNA leads to poor protein expression, weakening the vaccine’s ability to stimulate a strong immune response. This instability also poses challenges for the storage and transportation of mRNA vaccines, especially in developing countries where resources are often limited.
Previous studies have shown that optimizing the stability of the mRNA secondary structure, combined with the use of optimal codons, improves protein expression. The challenge lies in the vast mRNA design space created by synonymous codons. For example, about 10^632 different mRNA sequences translate into the same SARS-CoV-2 spike protein, a search space that previous methods could not handle.
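To make the size of that design space concrete, here is a minimal Python sketch that counts how many mRNA coding sequences translate into a given protein under the standard genetic code. The function names (`design_space_size`, `design_space_log10`) and the five-residue example peptide are illustrative assumptions, not part of LinearDesign.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated as UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid ("*" stands for a stop codon).
DEGENERACY = Counter(CODON_TABLE.values())


def design_space_size(protein: str) -> int:
    """Exact number of mRNA coding sequences that translate to `protein`."""
    return math.prod(DEGENERACY[aa] for aa in protein)


def design_space_log10(protein: str) -> float:
    """Base-10 logarithm of the design space size, convenient for long proteins."""
    return sum(math.log10(DEGENERACY[aa]) for aa in protein)


if __name__ == "__main__":
    toy_peptide = "MFVFL"  # hypothetical example, chosen only for illustration
    print(design_space_size(toy_peptide))   # 1 * 2 * 4 * 2 * 6 = 96
    print(round(design_space_log10(toy_peptide), 2))
```

Because the count is a product of per-residue choices, its logarithm grows linearly with protein length, which is why a protein of more than a thousand residues reaches a number with hundreds of digits.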
Although NLP and biology may seem unrelated at first glance, there is a powerful mathematical connection between the two fields. In human languages, a sentence consists of a sequence of words and an underlying syntactic tree containing noun and verb phrases that together convey meaning. Similarly, an RNA strand has a secondary structure associated with its nucleotide sequence based on its folding pattern.
The researchers used a language processing technique called lattice parsing, which represents potential word connections in a lattice graph and selects the most plausible option based on grammar. Similarly, they used a deterministic finite-state automaton (DFA) to build a graph that compactly represents all mRNA candidates. Applying lattice parsing to this graph to find the best mRNA is akin to identifying the most likely sentence among a variety of similar-sounding alternatives.
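The idea of packing every candidate into one graph can be sketched in a few lines of Python. This is only a toy under stated assumptions: the state layout, the function names, and above all the scoring function are hypothetical. GC content is used here as a simple additive stand-in score, so a plain dynamic program over the graph is enough. LinearDesign itself, as described above, optimizes secondary structure stability together with codon choice, which requires lattice parsing rather than this simplified search.

```python
from itertools import product

# Standard genetic code, grouped as amino acid -> list of synonymous codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append("".join(codon))


def build_dfa(protein):
    """Build {state: {nucleotide: next_state}} accepting every mRNA for `protein`.

    A state is (residue_index, codon_prefix); the start state is (0, "")
    and the single accepting state is (len(protein), "").
    """
    dfa = {}
    for i, aa in enumerate(protein):
        for codon in CODONS[aa]:
            for k in range(3):
                src = (i, codon[:k])
                dst = (i + 1, "") if k == 2 else (i, codon[:k + 1])
                dfa.setdefault(src, {})[codon[k]] = dst
    dfa.setdefault((len(protein), ""), {})
    return dfa


def _reverse_topological(dfa):
    # Later residues and longer prefixes come first, so successors are done first.
    return sorted(dfa, key=lambda s: (s[0], len(s[1])), reverse=True)


def accepts(dfa, rna):
    """Run the DFA on `rna`; True if it ends in the accepting state."""
    state = (0, "")
    for nt in rna:
        if nt not in dfa[state]:
            return False
        state = dfa[state][nt]
    return not dfa[state]  # only the accepting state has no outgoing edges


def count_sequences(dfa):
    """Number of distinct strings accepted by the DFA."""
    count = {}
    for state in _reverse_topological(dfa):
        edges = dfa[state]
        count[state] = 1 if not edges else sum(count[n] for n in edges.values())
    return count[(0, "")]


def best_path(dfa, edge_score):
    """Return (score, sequence) of the highest-scoring accepted string."""
    best = {}
    for state in _reverse_topological(dfa):
        edges = dfa[state]
        if not edges:
            best[state] = (0, "")
        else:
            best[state] = max(
                (edge_score(nt) + best[nxt][0], nt + best[nxt][1])
                for nt, nxt in edges.items()
            )
    return best[(0, "")]


def gc_score(nt):
    """Toy score (assumption): reward G and C nucleotides."""
    return 1 if nt in "GC" else 0


if __name__ == "__main__":
    dfa = build_dfa("ML")             # hypothetical two-residue peptide
    print(len(dfa))                   # 9 states
    print(count_sequences(dfa))       # 6 candidate sequences
    print(best_path(dfa, gc_score))   # (3, 'AUGCUG')
```

The graph grows linearly with protein length while the number of sequences it encodes grows exponentially, which is the compactness the DFA representation provides.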
Using this approach, LinearDesign takes only 11 minutes to generate the most stable mRNA sequence encoding the spike protein.
In direct comparison, sequences designed by LinearDesign showed significantly improved results compared to existing vaccine sequences. For the COVID-19 mRNA vaccine sequence, the algorithm achieved up to a 5-fold increase in stability (mRNA half-life), a 3-fold increase in protein expression levels (within 48 hours), and a staggering 128-fold increase in antibody response. For VZV mRNA vaccine sequences, the study reported up to a 6-fold increase in stability (mRNA half-life), a 5.3-fold increase in protein expression levels (within 48 hours), and an 8-fold increase in antibody response.
“Vaccines designed with our method may provide better protection at the same dose, or comparable protection at a lower dose with fewer side effects, which will greatly reduce vaccine R&D costs for biopharmaceutical companies and improve outcomes,” Dr. Zhang added. In 2021, Baidu and Sanofi launched a partnership to integrate the LinearDesign algorithm into Sanofi’s product design pipeline for mRNA vaccine and drug development.
Baidu has created a biocomputing platform based on PaddlePaddle called PaddleHelix, which includes the ERNIE-Bio-Computing Big Models. The platform explores applications of AI to small molecules, proteins/peptides, and RNA, providing a new research paradigm for AI in life sciences. Baidu’s ERNIE Big Model family forms a comprehensive big model technology system covering NLP, vision, cross-modal and bio-computing. The recently announced ERNIE Bot, a knowledge-enhanced large language model (LLM) that can understand and generate human language, is part of that family.
Baidu will continue to explore AI applications in life sciences, expand the scope and depth of its technology, and help safeguard the health and well-being of all mankind.
For more information:
He Zhang et al., Algorithm for optimized mRNA design improves stability and immunogenicity. Nature (2023). DOI: 10.1038/s41586-023-06127-z
Journal information:
Nature
Courtesy of Baidu Research
