Machine learning tools help accelerate MOF discovery

Metal-organic frameworks (MOFs) are an emerging class of materials that form microscopic, sponge-like structures with vast internal surface areas. Their properties promise to transform how society captures, absorbs and filters substances at the molecular level. Researchers say this could improve battery chemistry, make carbon capture more efficient and widen access to clean water.

But scientists face a selection problem. MOFs are highly modular: they consist of metal ion nodes and organic molecules that link the nodes into a large network. Researchers say there are trillions of possible chemical combinations. However, not all combinations are equally useful, and some cannot be synthesized in the laboratory.
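
To see why the design space grows so quickly, here is a minimal Python sketch that counts combinations of building blocks. The short lists of nodes, linkers and network topologies are illustrative placeholders chosen for this example; they are assumptions, not the building-block libraries used in the study.

```python
from itertools import product

# Illustrative placeholder building blocks (assumptions, not the study's data).
metal_nodes = ["Zn", "Cu", "Zr"]
organic_linkers = ["BDC", "BTC", "BPDC", "NDC"]
topologies = ["pcu", "fcu"]


def enumerate_candidates(nodes, linkers, nets):
    """Return every (node, linker, topology) combination as a tuple."""
    return list(product(nodes, linkers, nets))


candidates = enumerate_candidates(metal_nodes, organic_linkers, topologies)

# The count is the product of the list sizes: 3 * 4 * 2 = 24.
print(len(candidates))
```

Each additional option multiplies the total, so larger building-block libraries quickly produce more candidates than can be simulated one by one.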

Now, a team led by Adji Bousso Dieng has developed a way to use machine learning to predict which MOF structures are good candidates, avoiding the need for researchers to wade through countless useless structures.

“Our tool takes seconds to make predictions, compared to seven hours to more than two days with traditional molecular simulations,” said Dieng, an assistant professor of computer science and associated faculty member at the Princeton Institute for Materials Research.

Specifically, the team based its predictions on free energy, a measure of the stability of a molecular structure.

They published a paper detailing their work in the Journal of the American Chemical Society on December 16th.

Developing this tool required three major steps. First, the key physical and chemical properties of the MOFs had to be translated into machine-readable sequences. Next, the team built a database of MOFs to train the model. Finally, the model was run multiple times to make predictions for specific materials.

Translating the physical and chemical properties that relate to free energy, at the level of a MOF's individual atoms and units, into machine-readable representations was difficult and was key to this study.

“The sequence representation we came up with really unlocked everything,” Dieng said.

Using this system, they generated representations of 1 million MOFs.
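
The article does not describe the representation itself, so the following Python sketch is only a hypothetical illustration of the general idea: flattening a structured description of a MOF into a single text sequence that a language model can read. The field names, the bracketed tags and the example values are all assumptions made for this sketch.

```python
def to_sequence(mof):
    """Flatten a dict describing a MOF into one tagged text sequence.

    Hypothetical format: the tags and field names are invented for
    illustration and are not the representation used in the paper.
    """
    parts = [
        f"[NODE] {mof['node']}",
        f"[LINKER] {mof['linker']}",
        f"[TOPOLOGY] {mof['topology']}",
    ]
    return " ".join(parts)


# Made-up example entry, for illustration only.
example = {"node": "Zn", "linker": "BDC", "topology": "pcu"}
sequence = to_sequence(example)
print(sequence)  # [NODE] Zn [LINKER] BDC [TOPOLOGY] pcu
```

A real representation would have to carry far more physical and chemical detail, which is the difficulty the researchers describe.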

The team then trained a custom language model to predict free energy values for all 1 million MOFs in the database, tuning it on simpler properties closely related to free energy. When tested on a sample of approximately 65,000 materials from the database with known free energy values, the model's predictions were accurate 97% of the time.

Dieng's co-investigator Diego Gómez-Gualdrón of the Colorado School of Mines had previously determined that MOFs with a free energy below a certain value (4.4 kilojoules per mole) are considered stable and can be synthesized in the laboratory.

“If we have a new MOF, we can predict its free energy and whether it can be synthesized or not,” Dieng said.
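
The decision rule Dieng describes can be sketched in a few lines of Python. The threshold comes from the article; the function names and the sample values are hypothetical, and the agreement score shown is one plausible way to measure accuracy, not necessarily the metric reported in the paper.

```python
STABILITY_THRESHOLD_KJ_PER_MOL = 4.4  # value quoted in the article


def is_likely_synthesizable(free_energy_kj_per_mol):
    """Return True when the free energy falls below the threshold."""
    return free_energy_kj_per_mol < STABILITY_THRESHOLD_KJ_PER_MOL


def agreement_rate(predicted, reference):
    """Fraction of MOFs where prediction and reference give the same label."""
    matches = sum(
        is_likely_synthesizable(p) == is_likely_synthesizable(r)
        for p, r in zip(predicted, reference)
    )
    return matches / len(reference)


# Made-up free energies in kJ/mol, for illustration only.
predicted = [2.1, 4.0, 5.3, 4.6]
reference = [2.4, 4.5, 5.0, 4.9]

print([is_likely_synthesizable(p) for p in predicted])  # [True, True, False, False]
print(agreement_rate(predicted, reference))  # 0.75
```

In this toy data the second MOF is predicted at 4.0 but has a reference value of 4.5, so it lands on the wrong side of the threshold and counts as the one disagreement.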

The team is currently working on streamlining the sequence representation and reducing the computational overhead incurred by some structures. They are also using that system to add a search feature to their tool to help find stable MOFs.

“We've solved the problem and are now able to compute the sequence representation itself very quickly and very cheaply,” Dieng said. “This technology allows researchers to focus resources on promising candidates for practical applications in carbon capture, energy storage, catalysis, and gas separation.”

Reference: Niyongabo Rubungo A, Fajardo-Rojas F, Gómez-Gualdrón DA, Dieng AB. Accurate and fast prediction of MOF free energy using machine learning. J Am Chem Soc. 2025;147(52):48035-48045. doi: 10.1021/jacs.5c13960
