Developing machine learning models to investigate DNA methylation

DNA methylation and tissue of origin estimation from cfDNA ULP-WGS. Credit: Nature Communications (2024). DOI: 10.1038/s41467-024-47196-6

A Northwestern Medicine study published in Nature Communications details the development of a machine learning model that predicts the DNA methylation status of cell-free DNA from its fragmentation patterns.

DNA methylation, a biological process in which methyl groups are added to DNA molecules, acts as an “off switch” for certain genes, but it commonly malfunctions in diseases such as cancer.

DNA methylation in cell-free DNA (small amounts of DNA left behind by various cellular processes) can be measured by the current gold standard, whole-genome bisulfite sequencing, but the process is imperfect: it can damage the DNA, limiting scientists' ability to study it.

“Cell-free DNA refers to these short DNA fragments. When a cell dies, it releases the DNA into the blood,” said Yaping Liu, Ph.D., assistant professor of biochemistry and molecular genetics and co-corresponding author of the study. “This cell-free DNA outside the cell represents a signature of cell death.”

Unlike normal DNA, cell-free DNA degrades in specific patterns that are highly correlated with epigenetic status, so Liu began to wonder whether cell-free DNA fragmentation patterns could be used to predict DNA methylation levels, he said.

In this study, Liu and his collaborators used characteristics of circulating cell-free DNA fragments to train an unsupervised machine learning model that analyzes small sections of DNA called CpG sites.
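As a rough illustration only of what "unsupervised" means here, the Python sketch below fits a two-state mixture model to a single made-up per-CpG feature and assigns each site to one of the two states without ever seeing labels. It is not the study's model: this article does not describe the actual algorithm or features, and the feature (a hypothetical mean fragment length around each CpG), the synthetic numbers, and the function names are all assumptions chosen for the example.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_state_mixture(values, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with EM (no labels used).

    Returns (means, stds, weights, posteriors), where posteriors holds the
    per-value state probabilities from the final E-step.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]                      # start the two states at the extremes
    spread = (hi - lo) / 4.0 or 1.0
    stds = [spread, spread]
    weights = [0.5, 0.5]
    posteriors = []
    for _ in range(n_iter):
        # E-step: probability that each value belongs to each state
        posteriors = []
        for x in values:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, stds)]
            total = sum(dens) or 1e-300
            posteriors.append([d / total for d in dens])
        # M-step: re-estimate each state's parameters from weighted values
        for k in range(2):
            nk = sum(p[k] for p in posteriors)
            if nk < 1e-9:
                continue
            means[k] = sum(p[k] * x for p, x in zip(posteriors, values)) / nk
            var = sum(p[k] * (x - means[k]) ** 2
                      for p, x in zip(posteriors, values)) / nk
            stds[k] = max(math.sqrt(var), 1e-3)
            weights[k] = nk / len(values)
    return means, stds, weights, posteriors


# Synthetic stand-in data (NOT real cfDNA measurements): 200 CpG sites per
# hidden state, with a hypothetical feature centered at 160 or 180.
rng = random.Random(0)
true_states = [0] * 200 + [1] * 200
features = [rng.gauss(160.0 if s == 0 else 180.0, 5.0) for s in true_states]

means, stds, weights, posteriors = fit_two_state_mixture(features)
predicted = [0 if p[0] >= p[1] else 1 for p in posteriors]
accuracy = sum(p == t for p, t in zip(predicted, true_states)) / len(true_states)
print("estimated state means:", [round(m, 1) for m in means])
print("agreement with hidden states:", round(accuracy, 3))
```

The hidden states are used only at the end to check the result, never during fitting, which is the defining property of an unsupervised approach.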

The researchers then used the model to analyze human blood samples taken from healthy patients and patients with different types of cancer, and assessed the model's accuracy by performing separate whole-genome sequencing on the samples.

The model more accurately predicted DNA methylation status, primarily in CpG-rich regions of the genome, compared to traditional sequencing, according to the study.

“Clinicians are already generating large amounts of cell-free DNA genome sequence data using currently available tests,” said Liu. “With our model, we can further leverage that data to predict changes that occur in DNA methylation and genes.”

The model can also accurately predict which tissues cell-free DNA comes from, which could pinpoint the origin of aberrant methylation signatures that occur in various cancers, Liu said.

Liu's lab will continue to develop computational methods to better understand gene regulatory information from cell-free DNA fragments, he said.

“Our goal is to use the epigenetic information hidden in cell-free DNA to understand non-coding regions of the human genome,” said Liu, who is also a member of the Robert H. Lurie Comprehensive Cancer Center of Northwestern University. “We want to not only detect the disease early, but also have the opportunity to understand what's happening to the genome at that point.”

More information:
Yaping Liu et al, FinaleMe: Predicting DNA methylation by the fragmentation patterns of plasma cell-free DNA, Nature Communications (2024). DOI: 10.1038/s41467-024-47196-6

Journal information:
Nature Communications


