Behind the Lectern: Joseph Hogan

In his nearly thirty years at Brown University, Professor Joseph Hogan has witnessed not just a revolution in the fields of biostatistics and HIV research, but a transformation at Brown. In this interview, he traces the young history of biostats at the University and explains how the field helps researchers deliver results that are rigorous and reproducible.

Using statistics to help solve global health challenges.

Going back to the start of your career, tell us a little bit about how you came to the field.

I was a high school teacher in Los Angeles in the late 1980s. I graduated from UConn with a math degree, and I was pretty certain I wanted to teach high school. I think I also was a bit restless intellectually, so I entered a graduate degree program in applied math. The goal there was to just keep my teaching credential valid. And when I got into applied math, I learned that there was a lot of theory involved.

I think the path I took is a path that's followed by a lot of statisticians. We're pursuing a career in math or something like that, but at some point, the theory kind of ran its course for me. I really was hungry for something a little more concrete to grab onto, some way to make a difference.

I'm a statistician, that's my career, but I never took a statistics course until I was in graduate school. And when I did, it was like opening a treasure chest. I had no idea that math could be used in such concrete ways. I was taking that class and getting really excited about it. I was thinking really as a teacher: ‘There's so many examples here I can bring back to the classroom,’ but I was getting into the material intellectually as well. I had a great adviser encouraging me.

At the same time, in Los Angeles in the late '80s, we were almost at the epicenter of the HIV epidemic in the United States. It was exacting an enormous toll on the population in LA. It was all over the newspapers at the time; it was hard to escape. It wasn't quite like COVID as a public health emergency. It was like a slow-moving public health emergency, but it was taking on a lot of urgency.

And around that time, I had an adviser who convinced me that I should consider doing more graduate work, and that's when I learned about biostatistics. Once I figured out that biostatistics is using statistics to solve health problems, I couldn't think of anything more interesting or more motivating to study. And as I said, with AIDS as a backdrop, I felt like this was a place where people could really make a difference.

How did biostatisticians make a difference in combating the AIDS epidemic?

At that time, statisticians, biostatisticians in particular, were making enormous contributions to advancing HIV treatment and HIV therapy. In the early days of HIV clinical trials, the primary endpoint for testing out a new drug was mortality. It's crazy for us to think about now, but that was the endpoint. So you'd randomize people to receive a drug or not and then follow them until you saw enough events to tell whether one drug was working better. And of course, this was becoming intolerable.

The community of AIDS activists was also pushing for the FDA and the government to have more fast-track approval of these medications. So statisticians came up with different designs in which we could make decisions before mortality events started showing up.

There was a group of statisticians at Harvard’s School of Public Health, and to some extent at the University of Washington, who came up with this notion of a ‘surrogate marker.’ It was very controversial at the time, but the idea is, there would be a clinical endpoint, in this case it was CD4 count, which characterizes the functioning of your immune system, and the trials were then evaluated based on CD4 count instead of mortality.

We all knew that as the immune system depleted, mortality became more likely, but we also knew that we couldn't wait around for deaths in order to approve drugs. So designing trials based on a surrogate marker, which at the time was very controversial, and showing that it would allow an equivalent, almost exchangeable type of result, one that was just as valid as if you had followed people on to mortality: this was really the work of biostatisticians. And it helped make the case to agencies like the FDA and to the journals that were publishing articles on new drugs.

It saved lives.

It surely did. Yeah, it surely did save lives. I mean, there were many fewer choices then than there are now, but putting new antiviral agents into the hands of doctors and patients certainly saved lives because we know now that that's the way you stay alive if you have HIV. If you take an effective antiviral, HIV becomes a chronic disease.

How did that lead you to collaboration with your partners in Africa around HIV prevention and treatment?

I became interested in HIV when I was in grad school because I was at Harvard’s School of Public Health, which was handling all the data analysis for AIDS clinical trials at the time. I've always thought of HIV as a really compelling public health problem to be working on.

The work I was doing at Brown was primarily domestic, and a couple of grants I was on were running out when a colleague of mine, Jane Carter, who is an infectious disease doc over at the Miriam Hospital involved in the Brown-Kenya collaboration, came to me and said she’d identified a student who wanted to get her Ph.D. at Brown: 'But, you know, she lives in Kenya, and what do you think? Do you think she can make it into the Brown program?' And, to be honest, I knew very little about anything outside of the U.S. with respect to HIV.

At the time, HIV had become, I don't want to say a solved problem, but it had become a manageable problem in the U.S. and in Western Europe. Meanwhile, in sub-Saharan Africa, it was exerting what would become a catastrophic toll; this is in the late '90s, early 2000s. For example, the prevalence of HIV in Botswana was somewhere between 30% and 40%.

Botswana was losing almost an entire generation of young men to HIV. And the reason was that there was lots of migrant work in Botswana. There's a lot of mining, so you have young men traveling back and forth, but they're not transmitting typical STDs, they're bringing HIV back to their communities.

Professor Hogan with Professor Ann Mwangi PhD'11 of Moi University

In the early 2000s, the U.S. implemented what was potentially the most effective public health intervention of the century, which is PEPFAR, the President's Emergency Plan for AIDS Relief. It was a huge investment on the part of the U.S. to provide antiviral medication and the infrastructure for delivering it to many places throughout the world, in particular, sub-Saharan Africa. So now, fast forward to 2007.

When Jane Carter approached me, a Kenyan woman, Ann Mwangi, was working as a statistician in a program called AMPATH, the Academic Model Providing Access to Healthcare. AMPATH was one of those sites that was rolling out and building infrastructure for treatment delivery. And Ann was the only statistician working in this huge operation. 

We admitted her to Brown's biostatistics doctoral program. She showed up two weeks late to the semester because of visa issues, but Ann got her Ph.D. at Brown and today, she's my counterpart in Kenya. She and I are partners in building biostatistics capacity at Moi University in the AMPATH program.

Fast forward to 2023, AMPATH has tens of millions of dollars of NIH funding. We have built, over time, a team of about 10 to 12 biostatisticians, many of them trained at Brown. It's been incredibly gratifying.

“Ann taught me that the problems I was approaching, the problems I was working on, had to take on urgency. If this person was willing to make that kind of sacrifice (and she was looking to me for mentorship, advising), I felt that a switch went off in me. I felt much greater responsibility. And I've learned a lot in working with Kenyans about the immediacy of the problems they're facing and the importance of really pouring yourself into it.”

Joseph Hogan, Carole and Lawrence Sirovich Professor of Public Health, Professor of Biostatistics, Chair of Biostatistics

When Ann arrived, I only found out about a week after she had arrived that she had an infant daughter at home in Africa. At first, it was hard to know how to react to that exactly, but after processing it a little bit, I realized the responsibility that I had: Ann had decided to make this kind of investment where she was going to leave her young daughter and husband behind to come to Brown, make an investment in her future and the future of her country—quite literally. I mean, this is how she articulated it to me. And it was a seismic shift in my own career.

Ann taught me that the problems I was approaching, the problems I was working on, had to take on urgency. If this person was willing to make that kind of sacrifice—and she was looking to me for mentorship, advising—I felt that a switch went off in me. I felt much greater responsibility. And I've learned a lot in working with Kenyans about the immediacy of the problems they're facing and the importance of really pouring yourself into it.

A lot of people might make great biostatisticians, but don't have an introduction to the field. Could you give us a high-level overview of biostatistics?

It's the nexus point. Looking back, I should have known I was going to be a statistician because I was the kid who opened up the Sunday newspaper in the summer and studied every baseball statistic that was available. So what is statistics? It's grounded in mathematics for sure.

The backbone of statistics, math, is vast, but the particular field of mathematics that we rely on is probability. It's the mathematics of chance and random occurrence and uncertainty. 

But I also think biostatisticians have a lot in common with engineers and people who like to solve messy problems. Data problems are mostly messy, especially when the data comes from people. The sorts of things that we measure, that doctors measure, to try and see how healthy we are: we want to know, what's normal, what's not, what's an acceptable range, what's not, what's a worrying trend? But trying to sort that out from samples of people is enormously complicated.

So statistics marries the formalism of probability with the messiness of data and tries to find a way to have rigorous explanations for how data arrives in our hands. What are the attendant uncertainties and sources of variation? 

And now, computing is really a central part of doing statistics in the modern world. Having access to computers, you can do things fast. We run simulation experiments, we solve complicated differential equations. We handle, format, integrate and clean up massive, massive swaths of data.

There are math problems you can't solve with pencil and paper, equations you can't solve, processes you can't really replicate without writing some fairly sophisticated code that implements a method of analysis to get a solution to a problem.

So math and coding are the backbone, and then there has to be a willingness to, you know, jump into the fray: deal with complicated problems. And the combination of those three things makes, I think, a good statistician.

And then you just need good data.

Yeah, that's really the whole thing. Do you have the right data to answer the question you want to answer? Are you doing the right things with those data? This preoccupies statisticians. And there's an ethical imperative here.

All scientists agree that a randomized trial delivers the best grade of evidence. So when we want to know if a new intervention works—Does a new cancer therapy work?—the highest quality evidence is a randomized trial where you control who gets a treatment and who doesn't and nothing else interferes. So you make a direct comparison between the people who got the treatment and the people who got the placebo or some other treatment. The differences are only attributable to the treatment. That's the cleanest kind of statistical inference. And it's the highest grade of evidence in medicine and public health.

Now, data science grew out of the availability of data. Suddenly, data are everywhere and we can access those data. Even going back to when I was a student, getting your hands on high quality data was really, really difficult. It's still difficult to some degree, but now we're talking about analyzing electronic health records of hundreds of thousands or possibly millions of people in a single analysis. We're talking about surveillance data from cancer registries. We’re trying to draw rigorous conclusions from data like that. They're unstructured. They're not collected for the purposes of research. This is enormously difficult, and there is an ethical imperative. For example, in analysis of electronic health records, nobody's randomized to get treatment versus no treatment. They got the treatment their doctor decided to give them.

If you want to evaluate whether that treatment is effective or not, how do you sort out the treatment that wasn't randomized? There could be lots of other things that affect whether a person who gets a treatment turns out better or worse than a person who didn't. Maybe it's the sicker people who are getting this more advanced treatment and just by getting the treatment, you're labeling yourself as someone who's more prone to have advanced disease.

So how do we untangle these selection biases, as we call them in statistics? They come from the nonrandom allocation of interventions or therapies, from missing data, from the fact that electronic health records cover only people who show up for healthcare, not a whole population. All these things really force us to understand if data are being used properly.

Are they being used fairly? Are the analyses delivering answers to the questions you're asking? This doesn't sound like statistics. I've heard a lot of people say statistics is just p values. But all the things I described—Do you have the right data? Are the data answering your question? Have you handled all the biases as rigorously as possible?—that's statistics. A p value is like the sprinkles on the cake. It's really not the heart of what we do.
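The contrast drawn above, between a randomized trial and records in which sicker patients are more likely to be treated, can be illustrated with a small simulation. This is only a sketch under invented assumptions: the severity scale, the outcome formula, the true effect of 1.0 and every function name are hypothetical, and stratifying on severity is one textbook adjustment chosen for illustration, not a method described in the interview.

```python
import random
from statistics import mean

TRUE_EFFECT = 1.0  # hypothetical benefit of treatment on an outcome score


def simulate(n, randomized, rng):
    """Return a list of (severity, treated, outcome) tuples for n patients."""
    rows = []
    for _ in range(n):
        severity = rng.random()  # 0 = mild disease, 1 = severe disease
        # Trial: a coin flip decides treatment.
        # Health records: sicker patients are more likely to be treated.
        p_treat = 0.5 if randomized else severity
        treated = rng.random() < p_treat
        # Sicker patients do worse; treatment adds TRUE_EFFECT; noise on top.
        outcome = 10.0 - 5.0 * severity + TRUE_EFFECT * treated + rng.gauss(0.0, 1.0)
        rows.append((severity, treated, outcome))
    return rows


def naive_difference(rows):
    """Mean outcome of treated minus mean outcome of untreated patients."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def stratified_difference(rows, n_bins=10):
    """Compare treated and untreated within severity bins, then average."""
    strata = [[] for _ in range(n_bins)]
    for row in rows:
        index = min(int(row[0] * n_bins), n_bins - 1)
        strata[index].append(row)
    total, weight = 0.0, 0
    for stratum in strata:
        treated = [y for _, t, y in stratum if t]
        control = [y for _, t, y in stratum if not t]
        if treated and control:  # skip bins that lack one of the groups
            total += (mean(treated) - mean(control)) * len(stratum)
            weight += len(stratum)
    return total / weight


def run_demo(seed=0, n=20000):
    """Return (trial estimate, naive records estimate, stratified records estimate)."""
    rng = random.Random(seed)
    trial = simulate(n, True, rng)
    records = simulate(n, False, rng)
    return (
        naive_difference(trial),
        naive_difference(records),
        stratified_difference(records),
    )


if __name__ == "__main__":
    trial_est, naive_est, adjusted_est = run_demo()
    print(f"randomized trial:     {trial_est:.2f}")
    print(f"records, naive:       {naive_est:.2f}")
    print(f"records, stratified:  {adjusted_est:.2f}")
```

In this toy setup the randomized comparison lands near the true effect, the naive comparison on the records makes a helpful treatment look harmful because the treated group is sicker, and comparing like with like within severity bins recovers most of the effect. The adjustment works here only because severity is fully recorded, which real health records rarely guarantee.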

So you take data from the wild, and help try to pull from it what you might be able to understand from the results of a carefully controlled randomized clinical trial.

Right, in a nicely controlled randomized experiment, it's like putting together the pieces of a puzzle. The pieces fit together nicely. With data in the wild, it's like somebody dumped the puzzle box on the floor. The researcher, without a statistician, usually doesn't know what pieces are missing, or where to look for them. So it's really knowing where the weaknesses are, knowing where the missing pieces are. That's what statisticians are really good at.

You are one of the original members of Brown’s Department of Biostatistics. Tell us about that history.

Biostatistics at Brown got its start in 1994. Constantine Gatsonis was recruited from Harvard to start the Center for Statistical Sciences, and I was one of his first hires. Among that first cohort, there were three of us, and I'm the last one standing, but I feel really fortunate that he saw something in me to recruit me to Brown.

He was a pioneering leader. Constantine, I think from the minute he got here, knew what he wanted to build. He wanted to build an academic unit, ultimately the Department of Biostatistics, that didn't exist at Brown. There isn’t a Department of Statistics at Brown. 

Professor Constantine Gatsonis has been teaching Biostatistics at Brown since 1994.

Johns Hopkins and Harvard have had biostatistics departments for, I think, close to 100 years, if not more. And in the span of 25 or so years, under Constantine's leadership, we've put together a department that is now knocking at the door of the top 10. We're incredibly proud of that.

We have been able to attract some of the best young scholars around the country. I think that's a huge endorsement of what we're doing. The very best Ph.D. graduates, they have choices. When they choose us, that's the ultimate endorsement of the department.

And now we're a group of 18 and growing. We have a graduate program that serves Ph.D. students. We have a master's program that serves about 40 master's students. Now, under the leadership of Jen Nazareno and the School of Professional Studies, we're starting an online master's program in biostatistics.

It's going to reach even more people. We want to grow the field, grow the profession. More and more data is in the wild and we need people with these skills. We need people who know how to put the puzzle pieces together. These are the people we want to be training. Public health and medicine need more statisticians. This expertise keeps the knowledge base cutting edge, keeps the findings that we come up with first rate, rigorous, reproducible.

Biostatistics is a cornerstone of Brown’s School of Public Health as one of its four academic departments. How do you collaborate across the school?

Most public health research generates data of some kind or another. And public health research requires careful thought, especially when designing a study: how to collect data, how to analyze that data. A lot of researchers are trained in basic data analysis; they're conversant in it and what they want to learn from their data, what kind of models they might want to fit. 

But we're considered the go-to group for researchers who want to get the most out of their data, who want to make the most efficient use of their data—who want to probe deeply but with rigor.

So we're involved in the collaborative process really from start to finish with researchers in the school. We sit down with them, we first try to understand what question they want to answer, what resources they have available. We map out a strategy for collecting or identifying the right data, a strategy for modeling the data. And then, most importantly at the end, after the work is done, we find a way to explain what it all means. What does it mean that this model has this result? And we participate in writing papers and reports, and giving lectures. 

Tell us about the NextGen Scholars program.

In biostatistics, we don’t have as many Black and brown students as we should. In particular, African American students are vastly underrepresented in our profession. That's changing. I give a lot of credit to our own professional societies; the Biometric Society and the American Statistical Association in particular have really gone all in on diversifying our pool of students.

We have a lot of work to do. So with support from the school, particularly Dean Jha, the year after the Health Equity Scholars program was launched, we launched the NextGen Scholars program.

The NextGen Scholars program is targeted at students and graduates of historically Black colleges and universities around the country. There are a hundred and nine HBCUs, and we’re building partnerships with a number of them to really let undergraduates know what biostatistics is. There are a lot of really bright math, computer science and engineering students at these schools. If I think about my own experience as a math major at UConn, I had no idea what biostatistics was. I didn't even know what it was when I went to graduate school.

So part of the mission is to let people know what biostatistics is. Why is it part of the data science spectrum? And how is it that you can use your skills in math, computer science, engineering, or your quant field to make a difference in the world?

We have scholarship aid for students admitted to our master's program. Those students from HBCUs get tuition and a stipend and they also get a summer internship experience. The goal of the program is to reach people who traditionally wouldn't think about making biostatistics into a career and bring them into our field.

So far, the students who have entered the program are surprised, in a good way, that biostatistics is data science. We'll be graduating our first class this year. It's amazing. We're very committed and we think it will make a difference. We named it NextGen because we really think the next generation of biostatisticians should look a lot different than it does now. We want there to be a lot more African American and Latino statisticians.

What are the most satisfying moments for you in this job? What do you really enjoy?

It's satisfying knowing that I can bring a level of rigor to solve complicated and meaningful problems. It's knowing that the knowledge I possess can help people add rigor to their findings, wrap them up tightly and know that they've gotten the most out of their data. But the enjoyment comes from the personal interaction. I knew at some point I wasn't a person who could just work alone and solve problems by myself. So I'd say that the Kenyan collaboration has been deeply gratifying.

I've had people say to me, 'You know, you changed the course of my life.' And I can say it right back to them. This collaboration, it literally changed the course of my life and career. I look at the world differently, I experience the world differently. It's really been that gratifying. And I think anybody who teaches students will tell you that you can see changes happening in front of you; this is what really motivated me to be a high school teacher.

You can see the impact that you have. I think a really gratifying part of being an educator is that you can see the impact you're having on people. And, at the same time, it's reciprocated. Not in exactly the same way, but the different students I've had, different collaborators I've had, left an imprint on me. That means a lot.