Machine Learning and Microscopy | MIT News


Recent advances in diagnostic imaging, genomics, and other technologies have meant that the life sciences are awash in data. For example, if a biologist wants to study cells taken from brain tissue from an Alzheimer's patient, there could be countless properties they want to investigate, such as the type of cell, the genes they express, or their location within the tissue. But while cells can now be experimentally interrogated using many different types of measurements simultaneously, when it comes to analyzing the data, scientists can typically only work with one type of measurement at a time.

Dealing with so-called “multimodal” data requires new computational tools, and this is where Xinyi Zhang comes in.

A fourth-year PhD student at MIT, Zhang is working to bridge the gap between machine learning and biology to understand fundamental biological principles, particularly in areas where traditional methods have reached their limits. Working in the lab of Professor Caroline Uhler in MIT's Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society, and collaborating with researchers at the Broad Institute's Eric and Wendy Schmidt Center and elsewhere, Zhang has led multiple efforts to build computational frameworks and principles for understanding cellular regulatory mechanisms.

“All these are small steps towards the ultimate goal of understanding how cells work, how tissues and organs function, why we get sick, and why some diseases can be cured and some cannot,” Zhang says.

Zhang's free-time activities are similarly ambitious: The list of hobbies she picked up at the lab includes sailing, skiing, ice skating, rock climbing, performing in the MIT concert choir, and flying single-engine planes. (She received her pilot's license in November 2022.)

“I guess I just like going places I've never been and doing things I've never done before,” she says with typical understatement.

Her advisor, Uhler, says Zhang's quiet humility strikes a chord “in every conversation.”

“Every time we learn something, it's like, 'Okay, she's learning to fly,'” Uhler says. “It's just amazing. Everything she does, she does for the right reasons. She wants to be good at what she cares about, so it's really exciting.”

Zhang first became interested in biology as a high school student in Hangzhou, China. She liked the fact that her biology teacher could not answer her questions, which led her to consider biology the “most interesting” subject to study.

Her interest in biology eventually turned to bioengineering. Her parents, both middle school teachers, encouraged her to study in the US, and as an undergraduate at the University of California, Berkeley, she studied bioengineering as well as electrical engineering and computer science.

Zhang was set to enroll in MIT's EECS PhD program immediately after graduating in 2020, but the COVID-19 pandemic delayed her first year. Nevertheless, in December 2022, Zhang, Uhler, and two other co-authors published a paper in Nature Communications.

The groundwork for the paper was laid by co-author Xiao Wang, who had previously worked with the Broad Institute to develop a spatial cell analysis method that combines multiple forms of cell imaging and gene expression measurements on the same cells, and then maps each cell's location within the tissue sample it came from, something that had never been done before.

This innovation had many potential applications, including enabling new ways to track the progression of various diseases, but there was no way to analyze all of the multimodal data it would generate. That's where Zhang came in: she became interested in designing a computational method that could do just that.

The team chose chromatin staining as their imaging technique because it is relatively inexpensive yet reveals a lot of information about the cell. The next step was to integrate the spatial analysis techniques developed by Wang, and to do that, Zhang began designing an autoencoder.

An autoencoder is a type of neural network that typically encodes and compresses large amounts of high-dimensional data, then expands the compressed data back to its original size. In this case, Zhang's autoencoder did the opposite, taking the input data and making it higher dimensional. This allowed her to combine data from different animals and remove technical variations that aren't due to meaningful biological differences.
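For readers who want a concrete picture of the encode-then-decode idea, here is a minimal sketch in Python with NumPy. It is not the model from the paper: it is a generic linear autoencoder trained on made-up toy data, and every name in it (`train_linear_autoencoder`, `latent_dim`, `toy_data`, and so on) is hypothetical and chosen only for illustration. The `latent_dim` parameter sets the size of the encoded representation; it is smaller than the input in the typical case, but nothing in the sketch prevents it from being larger.

```python
import numpy as np


def train_linear_autoencoder(data, latent_dim, steps=3000, lr=0.05, seed=0):
    """Fit encoder and decoder matrices by gradient descent on the
    mean squared reconstruction error. Illustrative only."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = data.shape
    # Small random starting weights for the encoder and the decoder.
    w_enc = rng.normal(scale=0.1, size=(n_features, latent_dim))
    w_dec = rng.normal(scale=0.1, size=(latent_dim, n_features))
    for _ in range(steps):
        latent = data @ w_enc          # encode: map inputs to the latent space
        recon = latent @ w_dec         # decode: map back to the input space
        err = recon - data             # reconstruction error per entry
        grad_dec = latent.T @ err / n_samples
        grad_enc = data.T @ (err @ w_dec.T) / n_samples
        w_enc -= lr * grad_enc
        w_dec -= lr * grad_dec
    return w_enc, w_dec


def reconstruction_error(data, w_enc, w_dec):
    """Mean squared difference between the data and its reconstruction."""
    return float(np.mean((data @ w_enc @ w_dec - data) ** 2))


# Toy data (an assumption for this sketch): 200 samples with 6 features
# that are all mixtures of just 2 hidden factors.
rng = np.random.default_rng(1)
hidden = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
toy_data = hidden @ mixing
toy_data = toy_data / toy_data.std()   # rescale so the step size is well behaved

w_enc, w_dec = train_linear_autoencoder(toy_data, latent_dim=2)
initial_error = float(np.mean(toy_data ** 2))   # error of reconstructing all zeros
final_error = reconstruction_error(toy_data, w_enc, w_dec)
print(f"error before training: {initial_error:.4f}")
print(f"error after training:  {final_error:.4f}")
```

Because the toy data has only two underlying factors, a two-dimensional encoding should be enough to reconstruct it closely. Real imaging and gene expression data would call for nonlinear layers and far more careful training than this sketch shows.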

The paper used the technique, abbreviated as STACI, to identify how cells and tissues show the progression of Alzheimer's disease when viewed with different spatial and imaging techniques. The model could also be used to analyze a range of diseases, Zhang says.

If she had infinite time and resources, her dream would be to create a perfect model of human life. Unfortunately, time and resources are limited. But her ambitions are not, and she says she wants to continue using her skills to solve “the hardest problems that we don't have the tools to answer.”

She is currently working on completing several projects, one focused on studying neurodegeneration through imaging of the frontal cortex, and another project predicting protein images from protein sequences and chromatin images.

“There are a lot of questions that remain unanswered,” she says, “and I want to pick questions that make biological sense, questions that will help us understand things we didn't know before.”


