While working at Microsoft as a postdoctoral fellow, Abraham Flaxman learned that he loved analyzing big sets of data. Now he uses that passion at the Institute for Health Metrics and Evaluation (IHME), where he works to fill in the huge holes in global health data. His innovations – including the creation of a computer model estimating the prevalence of more than 200 diseases – earned him one of MIT Technology Review's "35 Innovators Under 35" awards. That put him in the same company as the founders of Google and Facebook.
You made an unusual transition from Microsoft. What's it been like?
This has been a great direction for me, bringing the tools of the computer scientist into the realm of global health and public health. It was the dawn of big data when I started working at Microsoft Research's Theory Group. When my postdoc came to a close, I started looking around, saying, "How can I keep this going? Where is an area that has real data – big data – and where does it really matter?" Public health jumped out as an example of that. I got really lucky and found a postgraduate research fellowship at IHME. It has worked out really well.
What did you do at Microsoft, and how was it relevant to public health?
One of the first things I did was look at the social networks in Xbox Live. Microsoft had lots of big data: what people are searching for in Bing, what people are clicking when they use the Microsoft toolbar, what advertisements generate the highest click-through rates. There are tons of important Microsoft business questions about all of these things. But the techniques you use to answer them are the same sort of techniques you need to use to say, "What are the chances that someone who's had a heart attack is going to have another event in the next 30 days?"
What are you doing for IHME?
We're trying to measure the world's health, but there are great gaps in our knowledge. We can fill them through additional surveys, by asking questions and getting answers, but that's very slow, expensive and resource-intensive. The other way to fill those gaps is through models and estimates. Mathematics gives you the tools to make those estimates.
What kind of gaps do you find?
One of the projects I've worked on is estimating the coverage of insecticide-treated nets to fight malaria in Africa. Countries have made huge efforts to get bed nets to people who need them. But we don't always know how many nets are really making it into households. The best available source is a household survey where people are hired to go out to a carefully chosen representative sample and knock on doors. But these surveys are done every five years or so. We want to know how that's going yearly, quarterly. So we developed a model to take all of the available data – including reports from national malaria control programs and data from manufacturers – and put it together to make estimates of coverage, which are included in the World Health Organization's annual report on malaria.
What have you learned so far about mosquito-net distribution?
Massive scale-up is possible – that's the main conclusion. There are countries that had very low coverage in the early 2000s that have scaled up to 60, 70, even 80 percent coverage in the parts of the country where there is malaria.
What are some of the other challenges with global health data?
Timeliness, and the time it takes to digitize data collected on paper. There are also problems with what you'd call non-sampling error. We rely on people's recall, and sometimes there's an incentive to answer one way or another. The question of whether your child slept under a bed net last night is, in some ways, asking if you've been a responsible parent. There's a multitude of distinct challenges in getting good information.
Tell us about 'verbal autopsies' used to estimate what people die from.
Some countries have no vital registration system, so there's no death certificate and there's not even a place to go to get an incomplete picture. Verbal autopsy is a method that has been developed to fill in gaps. A trained interviewer waits a respectful period of time, then asks someone what their friend or loved one died of and goes through a 40-minute-or-so interview that includes questions about signs and symptoms: Was she coughing leading up to the time of the death? Did she have a fever? Was there any injury related to the death? My work takes the results of these interviews and automatically comes up with a prediction of what the cause of death was. All of this will give us a more accurate, more complete, and more timely picture of what people are dying from.
How confident are you in the accuracy of your estimates?
Now we're getting into a complicated realm of philosophy, really. It's a subtle question when you're making a probabilistic prediction. If the weather report says, "80 percent chance of rain tomorrow," how well did they do? Well, if it rains, they were right. If it doesn't rain, they were right. This is actually another area of research for me – determining how to quantify how accurate my predictions are.
What did it mean to you to win the MIT award?
It was really nice to see this work recognized. The MIT folks come from an engineering background, and give awards to people who make things. In global health, awards go to people who save lives. I'm not making things, and I'm not saving lives directly. What I'm doing is generating tools for generating information. To see that honored at the level of a new invention that's going to save lives is exciting. It's a validation that people are valuing information.
What do you do outside of work?
My big project is with me right now, a five-month-old baby named Sidney. I had no idea how much I was going to love having a baby. I'm currently on paternity leave. I try to get out and enjoy the Northwest once in a while, by hiking or biking. I love to cook and eat and go to farmers' markets. I've also been writing a lot recently for work. I'm working on a book on the global burden of disease, so I have tried to read a lot of really good writing, but I have also been inspired by "Game of Thrones."