Artificial intelligence has exploded across our news feeds, bringing ChatGPT and related AI technologies to the broader public eye. Beyond popular chatbots, biologists are exploring ways to use AI to investigate core gene functions.
Previously, researchers at the University of California San Diego who study the DNA sequences that switch genes on used artificial intelligence to identify an enigmatic puzzle piece tied to gene activation, a fundamental process involved in growth, development, and disease. Using machine learning, a form of artificial intelligence, Professor James T. Kadonaga of the School of Biological Sciences and his colleagues discovered the downstream core promoter region (DPR), a “gateway” DNA activation code involved in the operation of up to one-third of our genes.
Building on this finding, Kadonaga and researchers Long Vo ngoc and Torrey E. Rhyne have now used machine learning to identify “synthetic extreme” DNA sequences with specifically designed functions in gene activation. Publishing in the journal Genes & Development, the researchers tested millions of different DNA sequences through machine learning by comparing the DPR gene activation element in humans and fruit flies (Drosophila). Using AI, they were able to find rare, custom-tailored DPR sequences that are active in humans but not in Drosophila, and vice versa. More generally, this approach can now be used to identify synthetic DNA sequences with activities that could be useful in biotechnology and medicine.
“In the future, this strategy could be used to identify synthetic extreme DNA sequences with practical and useful applications. Instead of comparing humans (condition X) with Drosophila (condition Y), we could test the ability of drug A (condition X) but not drug B (condition Y) to activate a gene,” said Kadonaga, a distinguished professor in the Department of Molecular Biology. “This method could also be used to find custom-tailored DNA sequences that activate a gene in tissue 1 (condition X) but not in tissue 2 (condition Y). The potential applications are countless. Synthetic extreme DNA sequences may be very rare, perhaps one in a million. If they exist, we can use AI to find them.”
Machine learning is a branch of AI in which computer systems continuously improve and learn based on data and experience. In the new study, Kadonaga, Vo ngoc (a former postdoctoral fellow at the University of California San Diego, now at Velia Therapeutics) and Rhyne (a staff researcher) used a technique known as support vector regression to “train” machine learning models on 200,000 well-established DNA sequences, based on data from real laboratory experiments. These sequences were the targets presented as examples to the machine learning systems. The researchers then “fed” 50 million test DNA sequences into the human and Drosophila machine learning systems and asked them to compare the sequences and identify unique sequences within the two giant datasets.
Although the machine learning systems showed that the human and Drosophila sequences largely overlap, the researchers focused on the core question of whether the AI models could identify the rare instances in which gene activation is highly active in humans but not in Drosophila. The answer was yes. The machine learning models successfully identified human-specific (and Drosophila-specific) DNA sequences. Importantly, the functions of the AI-predicted extreme sequences were validated in Kadonaga’s lab using conventional (wet lab) testing methods.
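The train, score, and compare workflow described above can be sketched in a few lines of Python. This is only an illustrative toy, not the researchers' code: the 8-base sequence length, the two motifs, the `toy_activity` stand-in for laboratory measurements, and all function names are hypothetical, and the model is a simple linear support vector regression fitted by subgradient descent so that the sketch needs no external libraries. Real work would more likely use a library implementation such as scikit-learn's `SVR`.

```python
import random

BASES = "ACGT"
SEQ_LEN = 8                 # hypothetical toy length, not the real DPR length
HUMAN_MOTIF = "GACGTGCA"    # made-up motifs standing in for lab-measured preferences
FLY_MOTIF = "GACGCATG"


def one_hot(seq):
    """Encode a DNA string as a flat 0/1 vector, four slots per position."""
    vec = [0.0] * (4 * len(seq))
    for i, base in enumerate(seq):
        vec[4 * i + BASES.index(base)] = 1.0
    return vec


def toy_activity(seq, motif):
    """Made-up 'measured' activity: fraction of positions matching a motif."""
    return sum(a == b for a, b in zip(seq, motif)) / len(seq)


def train_linear_svr(X, y, epsilon=0.05, lam=1e-3, lr=0.01, epochs=30, seed=0):
    """Fit weights and bias by subgradient descent on the epsilon-insensitive loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            pred = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            err = pred - y[i]
            # Errors inside the epsilon tube cost nothing; outside it, step back.
            g = 0.0 if abs(err) <= epsilon else (1.0 if err > 0 else -1.0)
            w = [wj - lr * (g * xj + lam * wj) for wj, xj in zip(w, X[i])]
            b -= lr * g
    return w, b


def predict(model, seq):
    """Predicted activity of one sequence under a trained (weights, bias) model."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, one_hot(seq))) + b


def find_extremes(model_x, model_y, candidates, top_n=5):
    """Rank candidates by predicted activity in condition X minus condition Y."""
    scored = [(predict(model_x, s) - predict(model_y, s), s) for s in candidates]
    scored.sort(reverse=True)
    return scored[:top_n]


rng = random.Random(42)


def random_seq():
    return "".join(rng.choice(BASES) for _ in range(SEQ_LEN))


# "Train" one model per species on sequences with known (here: synthetic) activity.
train = [random_seq() for _ in range(300)]
X = [one_hot(s) for s in train]
human_model = train_linear_svr(X, [toy_activity(s, HUMAN_MOTIF) for s in train])
fly_model = train_linear_svr(X, [toy_activity(s, FLY_MOTIF) for s in train])

# "Feed" unseen candidates to both models and keep the most species-specific ones.
candidates = [random_seq() for _ in range(2000)]
human_specific = find_extremes(human_model, fly_model, candidates)
fly_specific = find_extremes(fly_model, human_model, candidates)

for margin, seq in human_specific:
    print(f"{seq}  human-minus-fly predicted activity: {margin:.2f}")
```

Because `find_extremes` only compares two sets of predictions, swapping the two models yields the Drosophila-specific list. The same function would apply unchanged to any other pair of conditions, provided a trained model exists for each.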
“Before undertaking this research, we did not know whether the AI models would be ‘intelligent’ enough to predict the activities of 50 million sequences, especially the outlier ‘extreme’ sequences with unusual activities. Therefore, it is very impressive and quite remarkable that the AI models were able to predict the activities of the rare one-in-a-million extreme sequences,” Kadonaga said. He added that it would be essentially impossible to perform the equivalent 100 million wet-lab experiments that the machine learning technology analyzed, because each wet-lab experiment takes nearly three weeks to complete.
The rare sequences identified by machine learning systems serve as successful demonstrations and set the stage for other uses of machine learning and other AI technologies in biology.
“In daily life, people are finding new applications for AI tools such as ChatGPT. Here we demonstrated the use of AI for designing customized DNA elements in gene activation. This method should have practical applications in biotechnology and biomedical research,” said Professor Kadonaga. “More broadly, biologists are probably just beginning to harness the power of AI technology.”
