
Two kinds of molecular machines, chromatin remodelers (pink and green, left) and RNA polymerase II (grey, yellow, and blue, center), work together to read the genomic information stored in the densely packed DNA (white coils). Credit: Farnung Lab
For Lukas Farnung, nothing is more fascinating than how a single fertilized egg develops into a fully functioning human being. As a structural biologist, he studies this process at the smallest scale: trillions of atoms needing to work in sync to make it happen.
“I don't see a big difference between solving a 5,000-piece jigsaw puzzle and the work we're doing in our lab,” says Farnung, an assistant professor of cell biology in the Blavatnik Institute at Harvard Medical School. “We're trying to understand what this process looks like visually and from there form an idea of how it works.”
Nearly all cells in the human body contain the same genetic material, but what tissue type those cells become during development (liver or skin, for example) is determined primarily by gene expression: which genes are turned on and which are turned off. Gene expression is controlled by a process called transcription, and this is the focus of Farnung's research.
During transcription, molecular machines read the instructions contained in the genetic blueprint stored in DNA and create RNA, the molecule that carries those instructions. Other molecular machines read the RNA and use this information to create proteins that fuel nearly every activity in the body.
Farnung studies the structure and function of the molecular machines responsible for transcription.
In a conversation with Harvard Medicine News, Farnung spoke about his work and how machine learning is accelerating research in his field.
What is the central question your research is trying to answer?
I always say that we're interested in the smallest logistical problems. The human genome is present in almost every cell, and if you stretched out the DNA that makes up the genome, it would be roughly two meters, or six and a half feet, long. But this two-meter-long molecule has to fit inside a cell nucleus, which is just a few microns in size. That's like trying to fit a fishing line that stretches from Boston to New Haven, Connecticut, about 150 miles, into a football.
To achieve this, cells compact their DNA into a structure called chromatin, but in doing so, they make the genomic information on the DNA inaccessible to molecular machines. This creates a paradox: DNA needs to be compact enough to fit inside the cell's nucleus, but at the same time, molecular machines need to be able to access the genomic information on the DNA. We are particularly interested in visualizing the process by which a molecular machine called RNA polymerase II accesses the genomic information and transcribes DNA into RNA.
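As a rough check on the fishing-line analogy, the sketch below compares how much longer each "thread" is than its container. Only the two meters of DNA and the 150 miles come from the interview. The nucleus diameter (6 microns) and the football length (0.28 meters) are illustrative assumptions chosen for this example, so the result should be read as an order-of-magnitude comparison only.

```python
# Order-of-magnitude check of the fishing-line analogy.
# Values marked "assumption" are illustrative and not from the interview.

METERS_PER_MILE = 1609.344

dna_length_m = 2.0                      # from the text: about two meters of DNA
nucleus_diameter_m = 6e-6               # assumption: "a few microns" taken as 6 microns
line_length_m = 150 * METERS_PER_MILE   # from the text: about 150 miles
football_length_m = 0.28                # assumption: an American football, roughly 11 inches


def length_ratio(thread_m: float, container_m: float) -> float:
    """Return how many times longer the thread is than its container."""
    return thread_m / container_m


dna_ratio = length_ratio(dna_length_m, nucleus_diameter_m)
line_ratio = length_ratio(line_length_m, football_length_m)

print(f"DNA vs. nucleus:        {dna_ratio:,.0f} times longer")
print(f"Fishing line vs. ball:  {line_ratio:,.0f} times longer")
```

Under these assumptions both ratios fall in the hundreds of thousands, which is the sense in which the analogy holds. Choosing a smaller nucleus diameter brings the two numbers closer together.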
What techniques do you use to visualize molecular machines?
Our general approach is to isolate the molecular machines from the cells and observe them using certain types of microscopes or X-ray beams. To do this, we introduce the genetic material that codes for the human molecular machine of interest into insect or bacterial cells, and then we encourage the cells to produce large amounts of that machine. We then use purification techniques to isolate the machine from the cell so that we can study it in isolation.
But this gets complicated because in many cases we are not interested in just a single molecular machine, also called a protein. Thousands of different proteins interact to control transcription, so to understand the interactions between these proteins we have to repeat this process thousands of times.
Artificial intelligence is beginning to permeate many aspects of basic biology. Is this changing the way structural biology is researched?
For the past 30-40 years, research in my field has been a tedious process: a PhD student's career is dedicated to learning a little about a single protein, and it takes the careers of thousands of students to learn how proteins interact in cells. But in the last 2-3 years, we have increasingly turned to computational approaches to predict protein interactions.
A major breakthrough was Google DeepMind's release of AlphaFold, a machine learning model that can predict protein folding. Importantly, how a protein folds determines its function and interactions. We are now using artificial intelligence to predict tens of thousands of protein-protein interactions, many of which have never been described experimentally before. Not all of these interactions actually occur in cells, but they can be validated in laboratory experiments.
This is super exciting because it really accelerates science. Looking back at my PhD, the first three years were basically failures; I didn't find any protein-protein interactions. Now, with these computational predictions, PhD students and postdocs in my lab can be confident that their experiments validating protein-protein interactions will be successful. I call this molecular biology on steroids, and it's legit, because it gets you to the actual questions you want to answer much faster.
Besides efficiency and speed, how is AI transforming your field?
One breakthrough is that we can now compare every protein in the human body with every other protein, in an unbiased way, to see if they might interact. Machine learning tools are disrupting our field in a way similar to how personal computers disrupted society.
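To give a sense of the scale of an all-against-all comparison, the sketch below counts the unordered pairs among n proteins, which is n(n-1)/2. The figure of 20,000 is an assumption based on the commonly cited approximate number of human protein-coding genes. It does not come from the interview, and the true count of distinct proteins may differ.

```python
from math import comb

# Assumption: roughly 20,000 human protein-coding genes, one protein each.
ASSUMED_PROTEIN_COUNT = 20_000


def pair_count(n_proteins: int) -> int:
    """Number of unordered protein pairs: n choose 2 = n * (n - 1) / 2."""
    return comb(n_proteins, 2)


for n in (10, 1_000, ASSUMED_PROTEIN_COUNT):
    print(f"{n:>6,} proteins -> {pair_count(n):>12,} possible pairs")
```

Because the number of pairs grows with the square of the number of proteins, testing each pair at the bench is impractical, which is why computational prediction followed by targeted validation is attractive.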
When I started as a researcher, the structure of individual proteins was determined with X-ray crystallography, a powerful high-resolution technique that could take years. Then, during my PhD and postdoc years, cryo-electron microscopy (cryo-EM) came along, a technique that allowed us to see larger, more dynamic protein complexes at high resolution. Cryo-EM has driven major advances in our understanding of biology over the past decade and accelerated drug development.
I was lucky to be part of the so-called resolution revolution brought about by cryo-EM, but now it feels like machine learning for protein prediction is bringing about a second revolution, which is amazing to me and makes me wonder how much more acceleration we're going to see.
My guess is that we can do research 5 to 10 times faster today than we could 10 years ago. It will be interesting to see how machine learning changes the way we do biological research in the next 10 years. Of course, we need to be careful about how we manage these tools, but it's exciting to be able to make discoveries 10 times faster on problems we've been thinking about for a long time.
What are the downstream applications of your research outside the lab?
We are learning the basic workings of biology in the human body, but there is always the possibility that understanding the basic biological mechanisms can help us develop effective treatments for various diseases. For example, the disruption of chromatin structure by molecular machines has been found to be one of the main causes of many cancers. If we know the structure of these molecular machines, we can understand the effect of changing a few atoms to recreate the mutations that lead to cancer, at which point we can start designing drugs to target those proteins.
We have started a project in collaboration with the HMS Therapeutics Initiative to study a chromatin remodeler, a protein that is highly mutated in prostate cancer. We recently obtained the structure of this protein and are conducting virtual screening to see which compounds bind to it. We hope to be able to design compounds that could be developed into full-fledged drugs that could inhibit this protein and slow the progression of prostate cancer.
We also study proteins implicated in neurodevelopmental disorders such as autism. The tools we use to predict protein structure and protein-protein interactions can also predict how small-molecule compounds will bind to proteins, so this is another area where machine learning can help.
Speaking of collaboration, how important is collaboration across disciplines and specialties to your research?
Collaboration is extremely important for my research. The world of biology is so complicated with so many different research fields that it's impossible to understand it all. Collaboration allows people with different expertise to come together to tackle important biological problems, such as how molecular machines can access the human genome.
At HMS, we collaborate with other researchers on many levels. Sometimes we use our structural expertise to support the research of other laboratories. Other times, we may have solved the structure of a particular protein but need collaboration to understand the role of that protein in the broader cellular environment. We also collaborate with laboratories that employ other types of molecular biology approaches. Collaborations are crucial to facilitating progress and better understanding biology.
Courtesy of Harvard Medical School
Citation: Q&A: How Machine Learning is Propelling Structural Biology (July 22, 2024). Retrieved July 22, 2024 from https://phys.org/news/2024-07-qa-machine-propelling-biology.html