New protein folding AI significantly extends AlphaFold’s efforts



New protein folding AI predicts structures of 1 billion proteins

New open-source atlas generated by an AI tool called ESMFold2 significantly expands the world of known proteins

3D computer-generated model of cytotoxic T lymphocyte-associated protein 4.

The AI tool designed a binder to cytotoxic T lymphocyte-associated protein 4 (CTLA-4).

Science Photo Library/Alamy

The world of known proteins has grown even larger. Newly released artificial intelligence tools have generated over a billion predicted protein structures and an atlas of billions more protein sequences.

The database, known as the ESM Atlas, was released today by researchers at the Chan Zuckerberg Initiative’s Biohub, a biomedical research institute founded in San Francisco, California, by Facebook founder Mark Zuckerberg and his wife, physician and educator Priscilla Chan.

This atlas exceeds the number of entries in the AlphaFold database of predicted protein structures by more than 800 million and exceeds the previous ESM Atlas by approximately 300 million.




The predictions were made using ESMFold2, an AI model that Biohub says outperforms AlphaFold3, the latest version of Google DeepMind’s system, and other protein structure prediction AIs. The atlas is described in a preprint released today.

“What this atlas does is show the whole picture of protein biology, especially the parts that are least known,” said Alex Rives, Biohub’s director of science, who led the effort. “We think this will be a very powerful substrate for discovering new biology.”

Other scientists are impressed by the results, especially by the fact that ESMFold2 is completely open source. However, Biohub’s models are entering an increasingly crowded field, with competing open-source and proprietary protein models advancing at breakneck speed.

Antibody prediction

ESMFold2 is based on a “protein language” model that Rives’ team published in 2024 and was trained on billions of proteins from across the tree of life. The training data include “metagenomic” sequences from soil, ocean, and other environments that are not present in the AlphaFold database of predicted protein structures.

Rives’ team says ESMFold2 is superior to existing methods, including AlphaFold3, in determining the precise structure of interacting protein complexes, such as antibody molecules that bind to antigenic molecular targets.

In the preprint, researchers describe how they used ESMFold2 to engineer new antibodies and other proteins that can bind strongly to proteins involved in cancer and immunological conditions. When created and tested in the lab, most designs worked as expected.

Rives’ team used this tool to create an atlas containing information on 1.1 billion predicted protein structures and 6.8 billion protein sequences. Most of these come from poorly characterized metagenomic sequences. Rives hopes the freely accessible atlas will help scientists connect the known and unknown parts of the protein world. Researchers used the atlas to discover structural similarities between CRISPR microbial defense proteins and gene-editing proteins that were identified in soil fungi in 2023 and are also found in other eukaryotic species.

Supplementary database

Gemma Atkinson, a computational biologist at Sweden’s Lund University, said the newly published atlas should be “a great resource for biology”. “It will be interesting to see how large-scale protein language models can capture the fundamental rules of protein biology.”

Computational biologist Christine Orengo from University College London said the predictions would need to be evaluated first, but could help reveal new protein folds and functions, with implications for protein design and fundamental understanding of biology.

Martin Steinegger, a computational biologist at Seoul National University, said the biggest question is how accurately ESMFold2 can predict the structure of proteins that are significantly different from those already known. His team found that the first version of ESMFold was not particularly good at predicting unusual protein structures, especially those found in metagenomic data.

Sergey Ovchinnikov, a computational biologist at the Massachusetts Institute of Technology in Cambridge, sees the ESM Atlas as a supplement to, rather than a replacement for, the widely used AlphaFold database, which contains more than 200 million protein structures.

ESMFold2’s predictions of interacting proteins are impressive, but not all that surprising, Ovchinnikov adds. Earlier this year, Isomorphic Labs, a Google DeepMind biopharmaceutical spinoff, announced its own model that made significant progress in predicting such structures. Open-source models that the Biohub team did not directly compare with ESMFold2 also achieved impressive results in predicting protein interactions, Ovchinnikov says.

ESMFold2 is completely open source and has no restrictions on commercial use, so it has the potential to be widely used, Ovchinnikov said. “Many people will want to try ESMFold2.”

This article is reprinted with permission and was first published on May 27, 2026.
