Advanced algorithms generate customized “molecular fingerprints”

Graphical abstract. Credit: Chem (2024). DOI: 10.1016/j.chempr.2024.02.004

A team led by Professor Frank Glorius from the Institute of Organic Chemistry at the University of Münster has developed an evolutionary algorithm that identifies the structures within a molecule that are most relevant to a given question and uses them to encode the properties of molecules for various machine learning models.

The method is also suitable for predicting quantum chemical properties and the toxicity of molecules. It can be applied to any molecular dataset and requires no expert knowledge of the underlying relationships.

Artificial intelligence and machine learning are becoming increasingly relevant to everyday life, and the same is true for chemistry. For example, organic chemists are interested in how machine learning can help discover and synthesize new molecules that are effective against disease or helpful in other ways.

The new algorithm developed by Glorius' team follows evolutionary principles, using mechanisms such as reproduction, mutation, and selection to search for optimal molecular representations. Depending on the model and the question asked, it produces a customized “molecular fingerprint”, which the chemists used in their study to predict chemical reactions with surprising accuracy.

The method, published in the journal Chem, is also suitable for predicting quantum chemical properties and the toxicity of molecules.

To use machine learning, researchers must first convert molecules into a computer-readable format. Many research groups are already working on this problem, and as a result there are different ways to perform this task. However, it is difficult to predict which of the available methods will be best for answering a particular question, such as determining whether a certain compound is harmful to humans.

The new algorithm is designed to find the optimal molecular fingerprint in each case. To do this, it starts from a large number of randomly generated molecular fingerprints and, step by step, selects those that give the best prediction results.

“Following nature's example, we use mutations, i.e. random changes to individual components of a fingerprint, or the recombination of components of two fingerprints,” explains PhD student Felix Katzenburg.
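The selection, mutation, and recombination steps described above can be illustrated with a small sketch. This is not the team's implementation: the toy molecules and labels, the pattern pool `POOL`, the plain substring matching on SMILES strings (a crude stand-in for real substructure matching), and the leave-one-out nearest-neighbour fitness are all assumptions chosen to keep the example self-contained.

```python
import random

# Hypothetical toy data: (SMILES string, made-up binary label).
MOLECULES = [
    ("CC(=O)O", 1), ("CCC(=O)O", 1), ("OC(=O)c1ccccc1", 1), ("CC(C)C(=O)O", 1),
    ("CCO", 0), ("CCN", 0), ("c1ccccc1", 0), ("CCCl", 0), ("CC(=O)C", 0), ("COC", 0),
]

# Hypothetical pool of candidate fingerprint components (text patterns).
POOL = ["C(=O)O", "OC(=O)", "c1ccccc1", "N", "Cl", "CC", "O", "=O"]


def encode(smiles, fingerprint):
    """Encode a molecule as one bit per fingerprint component (substring test)."""
    return [int(pattern in smiles) for pattern in fingerprint]


def fitness(fingerprint):
    """Leave-one-out accuracy of a 1-nearest-neighbour vote (Hamming distance)."""
    vectors = [encode(smiles, fingerprint) for smiles, _ in MOLECULES]
    hits = 0
    for i, (_, label) in enumerate(MOLECULES):
        others = (k for k in range(len(MOLECULES)) if k != i)
        nearest = min(
            others,
            key=lambda k: sum(a != b for a, b in zip(vectors[i], vectors[k])),
        )
        hits += int(MOLECULES[nearest][1] == label)
    return hits / len(MOLECULES)


def mutate(fingerprint, rng):
    """Mutation: replace one randomly chosen component with a random pattern."""
    child = list(fingerprint)
    child[rng.randrange(len(child))] = rng.choice(POOL)
    return child


def recombine(a, b, rng):
    """Recombination: head of one parent joined to the tail of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(generations=20, pop_size=12, length=3, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(POOL) for _ in range(length)] for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # selection; parents survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(recombine(a, b, rng), rng))  # every child is mutated
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best), history


if __name__ == "__main__":
    best, score, _ = evolve()
    print(best, score)
```

Because the better half of each generation is carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next. A realistic setup would replace the substring test with proper substructure matching and the toy fitness with the validation score of the machine learning model of interest.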

“In other studies, molecules are often described by quantifiable properties chosen and calculated by humans,” Glorius adds.

“The algorithm we developed automatically identifies relevant molecular structures, so there is no systematic bias introduced by human experts.”

Another benefit is that the encoding method allows you to understand why the model makes certain predictions. For example, conclusions can be drawn about which parts of the molecule have a positive or negative impact on predicting how a reaction will unfold, allowing researchers to modify relevant structures in a targeted manner.
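One simple way to picture this kind of interpretability is to compare the average outcome of molecules that contain a fingerprint component with those that do not. The sketch below is only an illustration under stated assumptions: the yields are invented, the components are matched as plain substrings of SMILES strings, and `feature_effects` is a hypothetical helper, not the analysis used in the study.

```python
# Hypothetical toy data: (SMILES string, invented reaction yield between 0 and 1).
REACTIONS = [
    ("CC(=O)O", 0.9),
    ("CCC(=O)O", 0.8),
    ("CCN", 0.3),
    ("CCCN", 0.2),
    ("CCO", 0.5),
]


def feature_effects(data, fingerprint):
    """For each component, mean outcome when present minus mean outcome when absent."""
    effects = {}
    for pattern in fingerprint:
        present = [y for smiles, y in data if pattern in smiles]
        absent = [y for smiles, y in data if pattern not in smiles]
        if present and absent:  # skip components that cannot be compared
            effects[pattern] = sum(present) / len(present) - sum(absent) / len(absent)
    return effects


if __name__ == "__main__":
    for pattern, effect in feature_effects(REACTIONS, ["C(=O)O", "N"]).items():
        direction = "raises" if effect > 0 else "lowers"
        print(f"{pattern}: {direction} the average outcome by {abs(effect):.2f}")
```

A positive value marks a component associated with better outcomes in the toy data and a negative value one associated with worse outcomes, which is the kind of signal a chemist could use to decide which part of a molecule to modify.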

The Münster team acknowledges that the new method does not always yield the best results.

“Other methods, such as neural networks, may have an advantage, especially when significant human expertise is invested in selecting relevant molecular properties or when very large amounts of data are available,” says Katzenburg.

However, one of the main goals of this work was to develop a method for encoding molecules that can be applied to any molecular dataset and that does not require expert knowledge of the underlying relationships.

For more information:
Philipp M. Pflüger et al., Evolutionary algorithms for interpretable molecular representations, Chem (2024). DOI: 10.1016/j.chempr.2024.02.004

Journal information:
Chem


