Machine learning helps density functional theory model molecules more accurately

Almost a third of our supercomputer time is spent on molecular modeling, a way to test new medicines and materials on a computer before making them in the lab. The most accurate approach is to solve the quantum many-body (QMB) equations, which calculate where all the electrons in a material are and how they interact. However, these equations are so computationally expensive that they are often impractical to use.

A computational shortcut, density functional theory (DFT), uses the electron density, a probability map of where electrons are likely to be, together with the exchange-correlation (XC) functional, which summarizes how electrons interact. DFT simplifies quantum calculations, but no one knows the universal form of the XC functional. Scientists therefore use approximations tailored to specific conditions of the material. This approach works for finding trends but is too unreliable for accurate, quantitative predictions about molecules and materials.

“We want to bring the accuracy of the QMB method together with the simplicity of DFT,” says Vikram Gavini, a computational scientist at the University of Michigan (Sci. Adv. 2025, DOI: 10.1126/sciadv.ady8962).

In a new study, Gavini's team shows how to train machine learning (ML) models on QMB data to discover a more universal XC functional and build a bridge between the two methods. Previous attempts to train ML models to improve the XC functional typically used only the interaction energy of the electrons as training data. In contrast, Gavini's team also includes the potential, which describes how that energy changes at each point in space.

“Potentials are a stronger foundation for training because they highlight small differences between systems better than energies do,” Gavini says. This allows the model to capture subtle changes more effectively.
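The idea of training on potentials as well as energies can be illustrated with a toy example. The sketch below is not the team's model: it assumes a hypothetical two-parameter, local XC form on a one-dimensional grid, with synthetic reference data generated from that same form rather than from QMB calculations. The function names, the grid, and the loss weights are all illustrative assumptions. It shows that a wrong functional can be tuned to match the reference energy exactly while still getting the potential wrong, so adding a potential term to the loss exposes the error.

```python
import numpy as np

def xc_energy_density(rho, theta):
    # Toy local form (an assumption): e_xc(rho) = -a * rho**p
    a, p = theta
    return -a * rho**p

def xc_energy(rho, theta, dx):
    # E_xc = integral of rho * e_xc(rho) over the grid
    return np.sum(rho * xc_energy_density(rho, theta)) * dx

def xc_potential(rho, theta):
    # v_xc = d(rho * e_xc)/d(rho) = -a * (1 + p) * rho**p
    a, p = theta
    return -a * (1.0 + p) * rho**p

def loss(theta, systems, dx, w_energy=1.0, w_potential=1.0):
    # Squared energy error plus density-weighted squared potential error
    total = 0.0
    for rho, e_ref, v_ref in systems:
        e_err = xc_energy(rho, theta, dx) - e_ref
        v_err = xc_potential(rho, theta) - v_ref
        total += w_energy * e_err**2
        total += w_potential * np.sum(rho * v_err**2) * dx
    return total

# One synthetic "system": a normalized Gaussian density on a 1D grid
x = np.linspace(-5.0, 5.0, 201)
dx = x[1] - x[0]
rho = np.exp(-x**2) / np.sqrt(np.pi)

# Synthetic reference data from a "true" parameter set (illustrative only)
theta_true = (0.7386, 1.0 / 3.0)
systems = [(rho, xc_energy(rho, theta_true, dx), xc_potential(rho, theta_true))]

# A wrong exponent whose prefactor is tuned to reproduce the reference energy
p_wrong = 0.5
a_wrong = -systems[0][1] / (np.sum(rho ** (1.0 + p_wrong)) * dx)
theta_wrong = (a_wrong, p_wrong)

energy_only = loss(theta_wrong, systems, dx, w_potential=0.0)
with_potential = loss(theta_wrong, systems, dx)
print(f"energy-only loss:      {energy_only:.3e}")
print(f"energy+potential loss: {with_potential:.3e}")
```

With the energy term alone, the wrong parameters look essentially perfect; once the potential term is included, the loss becomes clearly nonzero. Real training data would contain many systems, and the functional form would be far more flexible than this two-parameter toy.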

To test the idea, the researchers used the exact energies and potentials of five atoms and two simple molecules obtained through QMB calculations. They trained the ML model on this compact dataset to create a new approximation of the XC functional. When used in DFT calculations, the new functional delivered impressive accuracy. According to Gavini, their model outperformed or matched widely used XC approximations while still keeping computational costs low.

“The model generalized beyond the small set of atoms it was trained on and produced accurate results for very different systems,” Gavini says. It was also relatively inexpensive to train, as it required data from only a few atoms and simple molecules. And because it was built using potentials directly linked to electron behavior, it avoided spurious or meaningless results. In contrast, many previous ML models work only within narrow datasets and often stray from the rules of DFT, which limits their usefulness.

“The tests conducted so far are preliminary but promising,” says Donald Tuller, a computational chemist at the University of Minnesota who was not involved in the study.

“This model is suitable for light atoms,” says Gavini. “We'd like to expand into solids next.” The team also hopes to pursue higher accuracy by using gradients of the potential to train future models.