Fall 2019

Using Data Science Methods to Answer Big Health Questions

UNC Gillings faculty engage in interdisciplinary collaboration and data-driven precision health approaches to tackle complex public health issues.

Gillings faculty thrive on tackling complex public health issues. We’re Gillings. We’re on it! Powerful technologies and innovative methods fuel their work, while UNC’s uniquely collaborative environment empowers them to reach across disciplines for answers to some of the most pressing public health challenges of our time. 

Successful data science hinges on the interplay between the questions that researchers want to ask and the methods they use to find the answers. Researchers at the UNC Gillings School of Global Public Health are right in the middle of that intersection, working together to improve health and health-care outcomes locally and globally.

“We like to collaborate,” says Michael Kosorok, PhD, W.R. Kenan, Jr. Distinguished Professor and chair of biostatistics. “We often team up with clinicians and other biomedical researchers who have problems they want to solve and figure out how to open doors to get those answers — and sometimes, the methods we choose can help refine those questions or change those goals. It’s all in the interaction.”

Data science approaches like causal inference and machine learning are used increasingly in precision health, which aims to provide personalized solutions to public health problems. Precision health works in three different stages of increasing complexity:

  • Prediction: Capturing information and characteristics of patients
  • Causal Inference: A what-if analysis that estimates what will happen if an action or treatment is changed
  • Decision Support: The development of computer algorithms to optimize actions or interventions to maximize health outcomes

“Think of those stages as what is, what might be, and how best to act to achieve our goals,” says Stephen Cole, PhD, professor of epidemiology. “We want the algorithms we develop for precision health to account for both the individual and the context so we can figure out better prevention and treatment strategies.”
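
For readers who think in code, the three stages can be sketched in a few lines of Python. This is only a toy illustration, not the researchers' method: the records, group names and function names are all hypothetical, and the "what if" step is a causal estimate only under the assumption that treatment was assigned as if at random within each group.

```python
from collections import defaultdict

# Hypothetical toy records: (patient_group, treated, good_outcome).
RECORDS = [
    ("low_risk", 0, 1), ("low_risk", 0, 1), ("low_risk", 1, 1), ("low_risk", 1, 0),
    ("high_risk", 0, 0), ("high_risk", 0, 1), ("high_risk", 1, 1), ("high_risk", 1, 1),
]


def mean(values):
    return sum(values) / len(values)


def predict(records):
    """Stage 1, "what is": observed rate of good outcomes in each group."""
    by_group = defaultdict(list)
    for group, _treated, outcome in records:
        by_group[group].append(outcome)
    return {group: mean(outcomes) for group, outcomes in by_group.items()}


def what_if(records):
    """Stage 2, "what might be": outcome rate in each group under each action.

    Assumption: within a group, treatment was assigned as if at random.
    Without that, these rates are associations, not causal estimates.
    """
    by_cell = defaultdict(list)
    for group, treated, outcome in records:
        by_cell[(group, treated)].append(outcome)
    return {cell: mean(outcomes) for cell, outcomes in by_cell.items()}


def decide(rates):
    """Stage 3, "how best to act": per group, pick the action with the higher rate."""
    groups = {group for group, _action in rates}
    return {g: max((0, 1), key=lambda action: rates[(g, action)]) for g in groups}


if __name__ == "__main__":
    print(predict(RECORDS))
    print(what_if(RECORDS))
    print(decide(what_if(RECORDS)))
```

In this made-up data the two groups look identical at the prediction stage, yet the decision stage recommends different actions for each, which is the kind of individual-level tailoring precision health aims for.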

Cole and Kosorok are working with Jeff Stringer, MD, professor of medicine, on the Limiting Adverse Birth Outcomes in Resource-Limited Settings (LABOR) study, a project funded by the Bill & Melinda Gates Foundation to evaluate 15,000 pregnant women in two or three developing countries. 

The mothers in the LABOR study will wear patches on their abdomens that are being developed specifically for this study. The patches will record oxygen saturation levels, heart rates and other real-time information about the women and their babies, and they are designed to tell the baby’s signals apart from the mother’s during labor and birth. Researchers also will examine the mothers’ medical records and structural information about the clinics themselves, such as the actions of staff and events over time.

Using all these data sources, researchers will develop new algorithms and precision medicine tools that will help doctors better assess an individual woman’s risk of having adverse pregnancy outcomes or having a baby at risk for poor birth outcomes, and predict the health interventions they will likely need.

Though they teamed up for the LABOR study, Kosorok and Cole do not often get the opportunity to work together. Cole has focused much of his career on study designs about population health risks and infectious disease, while Kosorok’s application of data science to health problems has centered primarily on cancer and diabetes.

One of Kosorok’s recent priorities is using precision medicine to improve Type 1 diabetes treatments. He is working with Elizabeth Mayer-Davis, PhD, RD, the Cary C. Boshamer Distinguished Professor and chair of nutrition, and Eric B. Laber, PhD, professor of statistics at North Carolina State University, on artificial intelligence tools that allow researchers to analyze each patient and determine optimal treatments in real time. They’ve developed a mobile app prototype integrating an insulin pump, a glucose patch, and activity monitors to help diabetes patients manage their glucose levels. 

For Cole, HIV and other infectious diseases have been at the heart of his work for several years. He’s currently involved in optimizing HIV treatment and exploring treatment as prevention for HIV infections. Cole recently teamed with Ada Adimora, MD, MPH, professor of epidemiology at Gillings and Sarah Graham Kenan Distinguished Professor of medicine, and others to develop new methods to project the benefit of HIV treatment as prevention among U.S. women, a population in which a randomized clinical trial seems infeasible.

Despite their distinct research interests, Cole and Kosorok occasionally walk around campus to bounce research ideas off each other and engage in discussions about science, learning and life. Kosorok is an accomplished music composer who originally planned biostatistics as a backup career, while Cole is an avid student of philosophy and history who is driven by learning.

“Reading various scholars and works in the historical record gives me context for what I’m doing now,” Cole says. “My project is to learn how to learn better.”

One of their shared philosophies is that although they delve deeply into math, machine learning and methods, their work is human-focused. “Being in biomedicine and working with real patients causes us to be really careful with the methods we use,” Kosorok says. “Humans are complex, but they are also precious, so we have to get it right.”

Data Science Basics

  • Causal Inference is a set of new approaches to address the age-old problem of induction. The problem of induction is the problem of how to justifiably infer causal relationships from observations. For example, will HIV-related mortality differ under plan A, compared to plan B? A key aspect of modern causal inference is the use of potential, or counterfactual, outcomes, as well as observed factual outcomes.
  • Machine Learning is a set of analytical methods from computer science and statistics that analyze data to produce predictions and to support decisions (e.g., given all of my medical data, which therapy should my doctor choose for me?).
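
The idea of potential versus factual outcomes can be made concrete with a small Python sketch. Everything here is hypothetical: the eight patients, their outcomes and the alternating assignment are invented so that the numbers work out exactly. In real studies nobody sees both potential outcomes for the same person, and random assignment makes the estimate match the truth only on average.

```python
# Hypothetical patients. Each tuple holds two potential outcomes (1 = survives):
# the first under plan A, the second under plan B.
POTENTIAL = [(1, 1), (0, 1), (0, 0), (1, 1), (0, 1), (1, 0), (1, 1), (0, 1)]

# Hypothetical assignment: patients alternate between plan A and plan B.
ASSIGNED = ["A", "B"] * 4


def mean(values):
    return sum(values) / len(values)


def true_effect(potential):
    """Average effect of plan B versus plan A, using BOTH potential outcomes.

    Only computable in a simulation, where the counterfactuals are known.
    """
    return mean([y_b for _y_a, y_b in potential]) - mean([y_a for y_a, _y_b in potential])


def observe(potential, assigned):
    """Keep only the factual outcome: the one under the plan actually received."""
    return [
        (plan, y_a if plan == "A" else y_b)
        for (y_a, y_b), plan in zip(potential, assigned)
    ]


def estimated_effect(observed):
    """Difference in mean observed outcomes between the plan B and plan A groups."""
    on_a = [y for plan, y in observed if plan == "A"]
    on_b = [y for plan, y in observed if plan == "B"]
    return mean(on_b) - mean(on_a)


if __name__ == "__main__":
    print(true_effect(POTENTIAL))
    print(estimated_effect(observe(POTENTIAL, ASSIGNED)))
```

The first function answers the "plan A compared to plan B" question directly but needs data no study can collect; the second works from what is actually observed, which is why the way patients come to receive each plan matters so much.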
