A new computational tool called MARRVEL-MCP helps researchers approach genetic diagnosis more efficiently by using everyday language to analyze and interpret vast amounts of genetic and biological information. The study, conducted by researchers at Baylor College of Medicine and Texas Children’s Hospital, was published in the American Journal of Human Genetics.
“Rare genetic diseases are often caused by small changes in a person’s DNA. However, not all genetic changes associated with a condition are responsible for the disease,” said co-author Hyun-Hwan Jeong, Ph.D., assistant professor of pediatric neurology at Baylor College of Medicine and a research associate at the Duncan Neurological Institute at Texas Children’s Hospital. “Some changes may contribute to the disease, while others may not. Identifying whether a particular genetic change or mutation is deleterious or an innocent bystander is important for diagnosing these conditions, but the process requires sifting through large amounts of data, making it a complex and time-consuming task.”
“To arrive at a genetic diagnosis, doctors and researchers must gather information from many different biological databases, each with its own formats and rules, and carefully piece together the evidence, which can take many hours for a single case, even for experts,” said co-author Dr. Zhandong Liu, associate professor of pediatric neurology at Baylor College of Medicine. Liu is also the head of the computational science department at Texas Children’s.
The current study introduces MARRVEL-MCP, a new computational tool designed to make this process faster and more accessible, especially for non-experts. It combines artificial intelligence, specifically large language models (LLMs) such as ChatGPT and Gemini, with a set of structured biological databases to help interpret genetic variation using plain language.
From MARRVEL to MARRVEL-MCP
The team previously developed MARRVEL (Model organism Aggregated Resources for Rare Variant ExpLoration), a computational approach that allows researchers to search many large genetic and biological databases simultaneously, within minutes, for information about genetic variations. MARRVEL has been well received by the scientific community, recording more than 43,000 users worldwide in 2025 alone.
MARRVEL combines genomic, functional, and model organism databases into an integrated platform. These sources contain different types of information that need to be considered to determine whether a genetic variation is the cause of a disease. Examples include how common a variant is in a population, whether it has been previously associated with a disease, predictions of whether it damages the gene, information from laboratory experiments or model organisms, and scientific papers discussing similar cases.
“However, MARRVEL requires precisely formatted input and produces comprehensive but complex output that requires considerable manual interpretation,” said Jeong. “This is a barrier that limits accessibility and efficiency for many users, as it assumes that they can interpret disparate output and integrate evidence across sections, which requires considerable expertise.”
MARRVEL-MCP, in which MCP stands for Model Context Protocol, changes the way this process works. Rather than having to learn technical formats or manually navigate databases, users can ask questions in plain language, such as “Is this BRCA1 mutation associated with cancer?”
In just seconds, MARRVEL-MCP automatically identifies important information (such as gene names and mutations), transforms it into the format required by each database, queries multiple data sources in the correct order, and combines the results to generate clear, evidence-based answers. MARRVEL-MCP covers areas such as disease association, genetic variation, gene expression, and scientific literature, and enables LLMs to autonomously create and execute multi-step analysis workflows from simple natural-language queries.
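The sequence described above (identify the gene, query several sources in a fixed order, combine the results) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the canned return values, and the regular expression are hypothetical, and in MARRVEL-MCP it is the LLM, not a regular expression, that interprets the question and chooses the tools. The sketch does not reflect the actual MARRVEL-MCP code or API.

```python
import re

# Hypothetical stand-ins for database lookups. A real system would query
# remote biological databases; these return canned text so the sketch runs offline.
def lookup_gene(gene):
    return f"gene record for {gene}"

def lookup_disease_links(gene):
    return f"disease associations for {gene}"

def lookup_literature(gene):
    return f"literature mentioning {gene}"

# Assumed query order: resolve the gene first, then gather supporting evidence.
TOOL_ORDER = [
    ("gene", lookup_gene),
    ("disease", lookup_disease_links),
    ("literature", lookup_literature),
]

# Crude heuristic for a gene symbol: 2 to 10 uppercase letters or digits,
# starting with a letter. A simplification, not how an LLM parses text.
GENE_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]{1,9}\b")

def extract_gene(question):
    """Return the first token that looks like a gene symbol, or None."""
    match = GENE_PATTERN.search(question)
    return match.group(0) if match else None

def answer(question):
    """Run each lookup in order and collect the results in one dictionary."""
    gene = extract_gene(question)
    if gene is None:
        return {"gene": None, "evidence": {}}
    evidence = {name: tool(gene) for name, tool in TOOL_ORDER}
    return {"gene": gene, "evidence": evidence}

if __name__ == "__main__":
    print(answer("Is this BRCA1 mutation associated with cancer?"))
```

Calling `answer` with the example question from the article picks out `BRCA1` and returns one evidence entry per hypothetical source, in the order the sources were queried.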
“What I’m most excited about is that MARRVEL-MCP shows that you don’t necessarily need the biggest frontier AI model to make meaningful progress in biomedical research,” said Jeong. “Providing small models with access to well-curated tools and structured contexts can make them smarter for specialized tasks. For example, the accuracy of gpt-oss-20b, a locally installable model, increases from 41% without MARRVEL-MCP to 94% with MARRVEL-MCP. This suggests a path to more accessible and cost-effective AI for rare disease research.”
“We released MARRVEL-MCP as an open resource that enables the integration of LLM agents with selected biomedical databases,” said Liu. “To facilitate independent exploration and reproducibility, we provide access to MARRVEL-MCP through a hosted interface published at https://chat.marrvel.org, allowing users to interactively test the system without having to install it locally. We are also revamping the main MARRVEL platform with agentic AI capabilities so it can perform independent actions beyond just generating text and responding to prompts, allowing users to easily move from plain language questions to structured genetic analysis.”
Reference: Everton Z, Botas J, Kim SY, Yao L, Liu Z, Jeong HH. MARRVEL-MCP: An agent interface for Mendelian disease discovery through tool-enhanced context engineering. Am J Hum Genet. 2026. doi: 10.1016/j.ajhg.2026.04.012
