Visitors view a digital representation of the human genome in the United States on August 15, 2001.
Artificial intelligence (AI) has many uses, most of which go unnoticed. From processing data on items scanned at self-checkout kiosks to analyzing soil composition for growing crops, AI is being used in an ever-widening range of applications.
Funding for AI in healthcare has tripled: In 2021, McKinsey reported that biotech companies raised more than $34 billion worldwide. In Q1 2024, just 20 healthcare-focused biotech companies raised a total of $2.9 billion.
In addition, in the first quarter of 2024, Moonwalk Biosciences raised $57 million in funding for epigenetic profiling, the study of how environment and behavior alter how genes work without changing the DNA sequence.
In 2003, the Human Genome Project, which began in 1990, completed the work of generating and deciphering the first sequence of the human genome. This work produced fundamental information about the human blueprint, paving the way for personalized medicine, accelerated research into human biology, and medical discoveries related to neurodegenerative disorders, cancer, and heart disease.
“We currently understand the function of 1 percent of the human genome. The other 99 percent exists, but we don't understand its function,” said Dr. Jennifer Hintzsche, CEO and co-founder of PherDal Fertility Science. “It was originally called 'junk DNA,' but it's far from junk.”
“There must be a biological reason why it has persisted throughout human evolution that we don't yet understand,” Hintzsche said. “We've sequenced just 1% of the human genome and we've already eradicated diseases that would have killed us before.”
“If we could understand the purpose of the remaining 99% of the human genome, how many other diseases are waiting to be eradicated? It would help transform medicine, and most importantly, every single percent of the knowledge we gain from our own DNA could save lives,” Hintzsche added.
Artificial intelligence
Genialis is an RNA biomarker company that has raised $13 million to date. The company uses machine learning and AI to look at the biology that underlies disease. Genialis CEO and co-founder Rafael Rosengarten says AI, and machine learning in particular, is good at finding patterns in the massive amounts of data that are essential for genomic interpretation.
“Each human genome is roughly 6 billion base pairs (3 billion long x two copies of each chromosome),” Rosengarten says. “Millions of genomic changes (mutations) occur in our bodies throughout our lives, resulting in millions of variations between people. And because in many cases it's not a single change that's the cause, but rather a combination of changes, this is a data space that is highly ripe for the power of AI.”
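The scale Rosengarten describes can be sketched with a little arithmetic. The Python below is only an illustration: the figure of one million variant sites is an assumption standing in for his "millions," not a measured value, and the variable names are made up for this example.

```python
from math import comb

HAPLOID_BASE_PAIRS = 3_000_000_000  # approximate length of one genome copy
COPIES = 2                          # two copies of each chromosome

# Roughly 6 billion base pairs per person, as quoted above.
diploid_base_pairs = HAPLOID_BASE_PAIRS * COPIES

# Illustrative assumption: one million variant sites under consideration.
variant_sites = 1_000_000

# If a disease depends on a combination of changes rather than a single one,
# the number of candidate combinations grows very quickly.
pairs = comb(variant_sites, 2)
triples = comb(variant_sites, 3)

print(f"{diploid_base_pairs:,} base pairs")
print(f"{pairs:,} possible pairs of variants")
print(f"{triples:,} possible triples of variants")
```

Even at pairs and triples, the search space is far too large to check by hand, which is the sense in which the problem suits pattern-finding methods.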
The challenge in sequencing the genome, Rosengarten says, is understanding which changes are meaningful: “That means changes that happen in our bodies, and maybe trying to figure out which of those are pathogenic and which are druggable.”
“This could mean differences in genomic sequences between groups or individuals within a population, and we're trying to figure out which of these genomic differences are causing the variants that we're seeing, particularly as they relate to healthspan, longevity, drug responses and other medical issues,” Rosengarten added.
The genomic divide
AI algorithms can fill gaps in access to genetic medicine
The Genialis team analyzes large genomic datasets to uncover patterns of disease prevalence and genetic predisposition in underserved populations.
The datasets range from small, with only a few dozen samples, to large, with thousands of patient samples. “The datasets themselves range from those generated by microarray technologies in the late 2000s to those generated by a variety of sequencing technologies since the early 2010s,” Rosengarten added.
“Not everyone's genomic data is exactly the same, and there are significant differences between ethnic groups and between men and women,” Rosengarten said. “There is no universal genomic database, but 60 to 80 percent of patient data sets worldwide come from people of European ancestry.”
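One simple way to see this kind of skew in a dataset is to tally ancestry labels before any training happens. The sketch below is a hypothetical illustration, not Genialis's method: the function name `ancestry_shares`, the labels, and the sample counts are all made up, and the 60 percent cutoff is borrowed from the lower end of the range Rosengarten cites.

```python
from collections import Counter

def ancestry_shares(labels):
    """Return each ancestry label's share of the samples (0.0 to 1.0)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

# Made-up metadata for ten samples; real datasets would be far larger.
samples = ["European"] * 7 + ["South Asian"] * 2 + ["African"] * 1

shares = ancestry_shares(samples)

# Flag any group that makes up 60 percent or more of the data (assumed cutoff).
overrepresented = {g: s for g, s in shares.items() if s >= 0.6}

for group, share in shares.items():
    print(f"{group}: {share:.0%}")
print("Overrepresented:", sorted(overrepresented))
```

A check like this only works when ancestry is recorded in the metadata at all, which is not guaranteed for public or proprietary datasets.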
Rosengarten says this causes major problems when treatments are developed for populations, such as people in rural India, using algorithms trained on data from people of different backgrounds.
Rosengarten cites Caroline Criado Pérez's book Invisible Women, which points out that medical researchers have historically tried to avoid female subjects whenever possible because of biological heterogeneity related, at least in part, to menstrual and estrous cycles.
“Misguided attempts to reduce the complexity of experimental or clinical designs often intentionally fail to take into account the variability relevant to the scientific question or medical need at hand,” Rosengarten said.
Rosengarten notes that although the National Institutes of Health has required since 2014 that grant applicants include plans to achieve gender parity among preclinical experimental models, an organism's sex is not always apparent from the metadata associated with the data in public or proprietary databases.
Genialis says it is working with institutions around the world, including in Qatar, India and across Asia more broadly, to combat this disparity by creating the world's most ethnically and geographically diverse cancer dataset and finding solutions that work for targeted patients.
“Genialis carefully sources datasets from its global network of clinical partners to ensure the data used to train and validate its biomarker algorithms is comprehensive and reflective of the entire population it treats,” Rosengarten said. “This intent is critical, because relying on data that represents too narrow a range of populations leaves AI and machine learning algorithms with too many gaps and biases to learn from.”
Understanding genomic diversity
“The first genome analyses were done on just one person. Just one person,” Hintzsche says. “For a long time, most clinical studies of cancer and other diseases were done by comparing tumors to the small data sets that we had access to, which included the genomes of just a handful of people, which lacked genetic diversity.”
Hintzsche says we need to consider how unrepresentative these genomes are.
“Most clinical studies have focused on comparing DNA to the DNA of just a handful of people,” Hintzsche says. “If you look at tumor DNA, for example, and compare it to this small dataset of DNA from just a few people, how do you know whether a mutation in the DNA is causing the cancer or just something that's common in people of a particular background? In most cases, you don't know.”
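The question Hintzsche raises can be framed as a lookup against background frequencies for the patient's own ancestry group. The sketch below is a simplified illustration under stated assumptions: the function `classify_variant`, the variant names, the frequencies, and the 1 percent "common" cutoff are all hypothetical, and real variant interpretation involves far more than a single threshold.

```python
def classify_variant(variant, ancestry, reference_freqs, common_threshold=0.01):
    """Classify a tumor variant against an ancestry-matched reference.

    reference_freqs maps ancestry -> {variant: frequency in healthy people}.
    Returns "unknown" when no reference exists for that ancestry.
    """
    freqs = reference_freqs.get(ancestry)
    if freqs is None:
        # No reference data for this background: we cannot tell.
        return "unknown"
    # A variant never observed in a covered population counts as frequency 0.
    if freqs.get(variant, 0.0) >= common_threshold:
        return "common in background"
    return "rare in background"

# Made-up reference data covering a single ancestry group.
reference = {"European": {"variant_A": 0.15, "variant_B": 0.0002}}

print(classify_variant("variant_A", "European", reference))
print(classify_variant("variant_B", "European", reference))
print(classify_variant("variant_A", "South Asian", reference))
```

With a reference built from only a few people or a single background, most lookups for other groups end in "unknown," which is the situation described above.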
“We have very little understanding of the genomic diversity that could explain the many different reasons why people respond differently to treatments and diseases, all of which is hidden deep within our DNA that we have not yet begun to understand.”
