Computational biology, chemistry, and materials engineering rely on the ability to predict the time evolution of matter at the atomic scale. Quantum mechanics governs atomic and electronic vibrations, motion, and bond dissociation at small scales, whereas the phenomena behind observable physical and chemical processes often unfold over considerably longer timescales. Bridging these scales requires advances on two fronts: highly parallelizable architectures that can exploit exascale machines, and fast yet highly accurate computational methods that capture quantum interactions. Current computational approaches cannot handle the structural complexity of realistic physical and chemical systems, and the timescales over which their observables evolve are too long for atomistic simulation.
Machine learning interatomic potentials (MLIPs) have been an active area of research over the last two decades. MLIPs learn energies and forces from high-accuracy reference data and scale linearly with the number of atoms. Early attempts combined manually crafted descriptors with Gaussian processes or simple neural networks, but these models often failed to generalize to structures absent from the training data, leading to poor prediction accuracy and fragile simulations that could not be transferred to new systems.
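The core recipe is easy to sketch: a network maps per-atom descriptors to per-atom energies, the total energy is their sum, and forces come out as the negative gradient of that energy with respect to atomic positions. The toy radial descriptor and small MLP below are illustrative placeholders, not Allegro's equivariant architecture; a minimal sketch in PyTorch:

```python
import torch
import torch.nn as nn

class ToyMLIP(nn.Module):
    """Minimal descriptor-based MLIP: per-atom energies from a small MLP."""

    def __init__(self, n_basis: int = 8, hidden: int = 32):
        super().__init__()
        self.register_buffer("centers", torch.linspace(0.5, 4.0, n_basis))
        self.mlp = nn.Sequential(
            nn.Linear(n_basis, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def descriptors(self, positions: torch.Tensor) -> torch.Tensor:
        # Stand-in descriptor: sums of Gaussians of interatomic distances.
        # Real MLIPs use symmetry functions or learned equivariant features.
        n = positions.shape[0]
        diff = positions.unsqueeze(1) - positions.unsqueeze(0)   # (N, N, 3)
        eye = torch.eye(n, dtype=torch.bool)
        # Pad the diagonal before sqrt so its (masked-out) gradient stays finite.
        dists = torch.sqrt((diff ** 2).sum(-1) + eye.float())
        rbf = torch.exp(-(dists.unsqueeze(-1) - self.centers) ** 2)
        return (rbf * (~eye).unsqueeze(-1)).sum(dim=1)           # (N, n_basis)

    def forward(self, positions: torch.Tensor):
        positions.requires_grad_(True)
        energy = self.mlp(self.descriptors(positions)).sum()     # total energy
        # Forces are the negative gradient of the energy w.r.t. positions,
        # making the learned force field energy-conserving by construction.
        forces = -torch.autograd.grad(energy, positions, create_graph=True)[0]
        return energy, forces

model = ToyMLIP()
pos = torch.randn(5, 3) * 2.0                        # 5 atoms at random positions
energy, forces = model(pos)

# Fit to reference energies/forces from quantum calculations (placeholder labels):
e_ref, f_ref = torch.tensor(0.0), torch.zeros(5, 3)
loss = (energy - e_ref) ** 2 + ((forces - f_ref) ** 2).mean()
loss.backward()
```

Summing strictly local per-atom energies is also what gives these models their linear scaling in the number of atoms.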
New research from Harvard demonstrates that biomolecular systems containing as many as 44 million atoms can be modeled with Allegro at state-of-the-art accuracy. The team applied large pretrained Allegro models to systems ranging in size from roughly 23,000 atoms for DHFR, to 91,000 for factor IX, 400,000 for cellulose, and 44 million for the HIV capsid, along with other systems of over 100,000 atoms. A pretrained Allegro model with 8 million weights, trained on 1 million structures from the SPICE dataset at hybrid-functional accuracy, achieves a force error of only 26 meV/Å. The ability to learn across the full breadth of inorganic materials and organic molecules at this data scale enables fast exascale simulations of a previously unimaginable range of material systems.
The researchers also showed that the uncertainty of a deep equivariant model's force and energy predictions can be quantified efficiently, enabling active learning for automatic construction of the training set. Because the equivariant model itself is accurate, the accuracy bottleneck now lies in the quantum electronic-structure calculations needed to generate MLIP training data. Gaussian mixture models are easily integrated into Allegro, allowing a single model, rather than an ensemble, to run simulations while flagging configurations with high uncertainty.
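One ensemble-free way to realize such an uncertainty signal is to fit a Gaussian mixture model to the per-atom feature vectors the trained network produces on its training data, then treat low likelihood under that mixture as high uncertainty on new structures. The sketch below uses scikit-learn with random stand-in features; the feature source, component count, and selection threshold are illustrative assumptions rather than the paper's exact settings.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-ins for per-atom feature vectors from the trained model's final layer.
# In practice these would come from running the MLIP on each structure.
train_features = rng.normal(size=(5000, 16))          # seen during training
new_features = rng.normal(loc=3.0, size=(100, 16))    # out-of-distribution

# Fit a GMM to the training-set features; 8 components is an arbitrary choice.
gmm = GaussianMixture(n_components=8, covariance_type="full", random_state=0)
gmm.fit(train_features)

# Negative log-likelihood under the mixture serves as the uncertainty score:
# configurations unlike anything seen in training receive low likelihood.
nll_train = -gmm.score_samples(train_features)
nll_new = -gmm.score_samples(new_features)

# Active-learning rule of thumb: send structures whose score exceeds, say,
# the 99th percentile of training scores back for quantum labeling.
threshold = np.percentile(nll_train, 99)
selected = np.where(nll_new > threshold)[0]
print(f"{len(selected)} of {len(new_features)} structures flagged for labeling")
```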
Allegro is a scalable approach that surpasses traditional message-passing and transformer-based designs. The team demonstrated top speeds of over 100 steps per second across a variety of large systems, with results scaling to more than 100 million atoms. Even at the massive 44-million-atom scale of the HIV capsid, simulations ran stably out of the box for more than a nanosecond, and the team encountered few issues in production runs.
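To see how such a potential slots into a molecular dynamics loop, here is a short Langevin run using ASE. The built-in EMT potential stands in for the MLIP calculator; in an Allegro workflow one would attach the deployed model instead, and large-scale production runs like those in the paper go through a LAMMPS integration.

```python
from ase import units
from ase.build import bulk
from ase.calculators.emt import EMT
from ase.md.langevin import Langevin

# Toy system: a small copper crystal. A production run would instead load a
# solvated biomolecule and attach a deployed Allegro model as the calculator.
atoms = bulk("Cu", "fcc", a=3.6).repeat((4, 4, 4))
atoms.calc = EMT()   # stand-in for an MLIP calculator

# Langevin thermostat at 300 K with a 2 fs timestep.
dyn = Langevin(atoms, timestep=2 * units.fs, temperature_K=300, friction=0.002)

def report():
    # Track per-atom energies as a quick stability check along the trajectory.
    epot = atoms.get_potential_energy() / len(atoms)
    ekin = atoms.get_kinetic_energy() / len(atoms)
    print(f"E_pot = {epot:.3f} eV/atom, E_kin = {ekin:.3f} eV/atom")

dyn.attach(report, interval=100)
dyn.run(1000)   # 2 ps here; production biomolecular runs extend to nanoseconds
```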
The team hopes their work will open new avenues in biochemistry and drug discovery by enabling a better understanding of the dynamics of large biomolecular systems and of the atomic-level interactions between proteins and pharmaceuticals.
Check out the paper. Don't forget to join our 20k+ ML SubReddit, Discord channel, and email newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions about the article above or if we missed anything, feel free to email us at Asif@marktechpost.com.
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her bachelor's degree at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields. Her passion lies in exploring new advancements in technology and their practical applications.
