Making the data speak

Biostatistician Ying Ma’s work in genomics is shaping personalized treatments for the world’s most challenging diseases.

It started with a love of math.

Drawn to the subject at an early age, Ying Ma remembers falling in love with the clarity and structure of mathematical reasoning and looking for ways to apply it beyond the classroom. At age 13, she discovered math competitions.

“The problems were very different from regular classroom exercises—they required deeper logical thinking and careful reasoning, and often drew on knowledge beyond our curriculum,” Ma said of her experience in academic competitions. “That was when I realized that I truly loved not just math as a subject, but the process of thinking mathematically.”

What Ma didn’t know then, however, was how to take that passion and make a career out of it. At Nankai University in Tianjin, she got her answer. Ma became part of a team conducting a survey investigating the association between women’s oral health during pregnancy and their children’s oral health, which taught her an important lesson.

“I realized just how important data analysis is in studies like this,” Ma said. “Yes, you need to know what the scientific question is, what the public health question is, but you also need to know statistical methods to design the study in the first place, analyze the data, interpret it, visualize it, make decisions from it and make it accessible to inform the public good. That experience was my first step toward developing statistical methods to address a public health or biological question. It was also my first understanding of biostatistics.”

The Interdisciplinary Role of Biostatistics

Over the course of her studies and early career, Ma says, her understanding of biostatistics has only deepened, as has her admiration for the role it plays not only in public health but in biology, medicine and engineering.

“It’s a very interdisciplinary field,” said Ma, who joined Brown’s School of Public Health in 2023 as an assistant professor of biostatistics and assistant professor of healthcare communications and technology affiliated with the Center for Computational Molecular Biology. “We collaborate with many different experts, including genetic epidemiologists, clinical doctors, computer scientists, statisticians, engineers and biologists.”

Some biostatisticians focus on teasing apart cause and effect—helping researchers determine whether something truly causes disease or is simply associated with it. Others design clinical trials, calculating how many patients need to be enrolled, how treatments should be tested and how to measure whether a new drug actually works. Some work at the cellular level, analyzing genetic and molecular data to understand how diseases develop, while others help advance precision health, using data to predict how illness and treatments may affect individuals differently.

What they all have in common, Ma says, is simple: “We make the data speak.”

Mapping Genes for Personalized Care

Ma works in a subfield known as genomics, which studies genes and how they function inside cells. Next-generation sequencing technologies allow scientists to examine tissues in unprecedented detail, revealing which genes are active in individual cells and generating vast amounts of molecular data. Ma develops statistical and computational methods that make sense of all that information.

She aims to understand why diseases such as cancer or Alzheimer’s progress differently from person to person. Ultimately, she hopes her work will help uncover the biological mechanisms driving those differences so that the knowledge can be used to improve diagnosis, sharpen prediction and, one day, guide more personalized treatment.

“When you collect the data, it doesn’t look like a picture of tissue,” Ma said. “It’s just measurements showing which genes are expressed in each cell. We use statistical methods to reconstruct those numbers into a clear picture of what’s happening inside the tissue.”

For example, in cancer research, scientists want to understand how different cells are arranged inside a tumor. Researchers look for regions dominated by cancer cells and regions that contain immune cells trying to attack the tumor, because that balance can shape how aggressive the tumor is, how it progresses and how it responds to treatment. Those patterns are not visible just by looking at raw data. They have to be uncovered through the type of careful analysis Ma and her team at Brown conduct. 

Once that spatial distribution is identified within a single tissue sample, the next step is comparison with thousands of other samples. Measuring how those patterns differ from person to person helps pinpoint what is driving those differences at the molecular level and can help explain why a disease develops or progresses differently in different patients. Over time, those insights can improve doctors’ ability to anticipate how a patient’s illness is likely to behave.

“That math can help the doctors make better decisions or make better treatment plans,” Ma said.

This type of work is the focus of a project for which Ma was recently awarded federal funding by the National Institutes of Health. The goal is to link these cellular patterns to real-world outcomes, such as how advanced a cancer is, or how well a patient may respond to a particular therapy.

Advancing Risk Prediction

In recent years, Ma has also become a leader in research on polygenic risk scores, which estimate a person’s inherited risk for certain diseases based on their genetic profile. Ma has helped develop new ways of building these scores and has published widely on how they can be used responsibly and effectively.

In one project, published in The American Journal of Human Genetics, Ma and her collaborators created an online resource that compiles polygenic risk scores for dozens of common health-related traits, from cholesterol levels to cancer risk. The platform allows researchers around the world to access, compare and test these scores in large health databases.

“This allows researchers to better understand genetic contributions to disease risk and through that may help identify individuals who could benefit from earlier monitoring or preventive strategies,” Ma said.

Prioritizing Data Privacy

Still, as genetic data becomes more abundant, so do concerns about how that information is protected. For instance, genetic information can reveal deeply personal details about a person’s health and ancestry, making it especially sensitive. 

Through a National Science Foundation grant, Ma is helping develop statistical and computational tools that allow hospitals and research centers to collaborate without exposing sensitive patient information. Her approach can also help reduce bias and improve how well predictive models work across different communities by enabling institutions to learn from larger and more diverse patient populations.

“There is a long way to go, but we are hoping to make it work because…this can lead to more robust discoveries and models that generalize better across populations while protecting data privacy,” Ma said. The distance still to cover also reminds Ma that her entire subfield, which maps where genes and molecules are located inside tissues, is still a relatively new area of study.

Even so, she is encouraged by the rapid technological advances of the past two decades and our growing ability to analyze information at a scale that once seemed impossible. This, Ma says, applies not just to genomics but to all of biostatistics, including at Brown. It is all part of an effort, as Ma put it, to “make the data speak.”