“Statistical control theory is an emerging area that could lead to advances in new statistical theory and ideas”: An interview with Michael Kosorok

Professor Michael Kosorok is W. R. Kenan, Jr. Distinguished Professor and Chair of the Department of Biostatistics, and Professor in the Department of Statistics and Operations Research, at the University of North Carolina at Chapel Hill.

His research has focused on the application of empirical processes and semiparametric inference to statistics and biostatistics. He is a Fellow of both the American Statistical Association and the Institute of Mathematical Statistics, and the author of Introduction to Empirical Processes and Semiparametric Inference.

He is also an Associate Editor for Annals of Statistics; Journal of the American Statistical Association, Theory and Methods; and Journal of the Royal Statistical Society, Series B.

During JSM 2015 last month, in front of a full lecture theatre with standing room only, Professor Kosorok gave one of the IMS Medallion Lectures, in which he discussed his research on a new approach in biostatistics: outcome weighted learning. Statistics Views talked to Professor Kosorok following his outstanding lecture about his career and current research.


1. When and how did you first become aware of statistics as a discipline and what was it that inspired you to pursue a career in biostatistics?

When I was between my freshman and sophomore years at college, I went home and worked at Battelle Pacific Northwest Laboratory in Richland, Washington. Part of my job that summer was to find ways to use Markov chains to predict weather. These were very high-dimensional Markov chains and computationally a lot of work, but I started to become very fascinated by the theory of how it worked. These predictions tended to do pretty well, which was interesting. I had been majoring in pure math and also music. I returned to the math department and explained to my teachers that this was the kind of problem I would like to work on, so which area of math does this? They looked at me and said, “Well, that is actually statistics”. I was surprised. I had no idea as I had only thought of statistics in terms of sports, accounting, etc. Statistics was much more interesting than I had thought.

So I met with a professor named Dennis Tolley, who teaches at Brigham Young University and who had been doing something very similar. I took my first class from him on mathematical statistics and worked from the Hogg and Craig textbook. I fell in love with the field, and the interesting thing that I learnt later is that Tolley obtained his PhD in biostatistics at the University of North Carolina at Chapel Hill, which is now where I am based as Chair of Biostatistics! Interestingly, Dr. Tolley is currently Chair of Statistics at Brigham Young University as well.

2. You are currently Distinguished Professor and Chair of the Department of Biostatistics, and Professor of the Department of Statistics and Operations Research at the University of North Carolina at Chapel Hill. Over the years, how has your teaching and research motivated and influenced each other? Do you continue to get research ideas from statistics and incorporate them into your teaching?

I’ve had a number of opportunities to teach courses related to my research, which has been really helpful. For example, I have taught survival analysis, theoretical statistics and empirical processes, which are very closely related to my research. The teaching helps strengthen my ability to do research.

I have also taught consulting and that has helped with collaborative research, so I have always had this interplay between applications and theory. Part of teaching is mentoring doctoral students. I have mentored 35 doctoral students so far in my career and most of the research that I do today involves at least one doctoral student. They help me get more research done; it’s an opportunity to share ideas, some of which come to fruition.

3. Your research interests include cancer, cystic fibrosis, personalised medicine and clinical trials. What are you focussing on currently?

Right now, what I am very interested in is precision medicine. One of the things that really intrigues me is the idea of being able to do mHealth with that, where a person uses a smartphone to collect information and also give suggestions, such as treatments for individuals, collect data on these individuals and then combine the data to learn what the best treatment is. You can tell, for example, if an individual has diabetes, how much insulin that individual should take, what they should do about their exercise, diet, etc. I think this is a very interesting open area. It is an area where the kind of data we have and the kind of questions you can answer lend themselves very much to the precision medicine work that I have been focussing on.

Another area I am really interested in is complex decision-making and using data to inform those decisions so that the decision is justified empirically and has good reproducibility properties. The idea of decisions being on a continuum is really fascinating to me.

4. What has been the most exciting development that you have worked on in statistics during your career?

Probably what I am working on right now, which is outcome weighted learning, and that seems to be opening up very wide opportunities and possibilities. It could simply be the fact that I am working on it now. I don’t like to stay on the same topic too long. I think a lot of people are starting to think about outcome weighted learning and it is going to stay interesting for me for a while.

5. Your research in cystic fibrosis was instrumental in changing public policy so that routine new-born screening is now practiced in essentially all 50 U.S. states. What is the next big change you would like to see facilitated by developments in biostatistics?

One of the big things that I would like to see happen is statisticians being more involved in the data science revolutions that are going on. Sometimes really bad things are done because statisticians aren’t involved, and so we can blame other people for the problems or we can do something about it! Sometimes, we are not as proactive as we should be in communicating with other people what we do and what we can bring to the table; sometimes we are not willing to work in such fast-paced, difficult areas of research where we have to be constantly adapting. We have to move through new ideas quite quickly when it comes to Big Data, so I think we can do a better job communicating and also being more willing to adapt.

There are also a couple of big areas in biostatistics. For example, statistical control theory, which is an extension of control theory beyond dynamics and stochastic modelling to data-based problem solving, is very early in its development. I think this is really interesting and an emerging area that could lead to advances in new statistical theory and ideas. We could make some really interesting advances in medicine but probably in other disciplines as well.

6. Your lecture here at JSM 2015 is the IMS Medallion Lecture III. Could you please tell us about your theme for the lecture and what points you brought across?

The main point I discussed in the Medallion lecture is a particular kind of precision medicine problem: identifying ways of dividing a population into groups based on feature variables or covariates. Each of these groups receives a different treatment so that the assignment will lead to optimal outcomes for that population.

I mostly discussed a new approach—outcome weighted learning—which is able to convert the problem from an indirect approach, where you have to do regression, followed by inverting the regression, to directly estimating the treatment decision rule. We focused on a special setting where we need to find an optimal dose personalised for the individual.
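
As an editorial illustration (not taken from the lecture), the idea of estimating a treatment rule directly can be sketched in a few lines of Python. The sketch assumes a randomised two-arm trial with known assignment probability 0.5, outcomes where larger is better, and a made-up toy data set. It scores each candidate rule by an inverse-probability-weighted average of the outcomes of patients whose assigned treatment agrees with the rule, then picks the best rule by brute-force search over simple thresholds. Published outcome weighted learning methods typically optimise a smooth surrogate of this criterion with a weighted classifier rather than searching a grid; the names `ipw_value`, `best_threshold_rule` and `toy_data` are hypothetical.

```python
from typing import Callable, List, Tuple

# One trial record: (covariate x, assigned treatment a in {-1, +1}, outcome r).
Record = Tuple[float, int, float]


def ipw_value(rule: Callable[[float], int], data: List[Record],
              propensity: float = 0.5) -> float:
    """Inverse-probability-weighted estimate of the mean outcome
    if every patient were treated according to `rule`."""
    total = 0.0
    for x, a, r in data:
        if rule(x) == a:  # only patients whose treatment agrees with the rule count
            total += r / propensity
    return total / len(data)


def best_threshold_rule(data: List[Record],
                        propensity: float = 0.5) -> Tuple[float, float]:
    """Search rules of the form 'give +1 if x > c, else -1' and return the
    cutoff c with the largest estimated value. No outcome regression is fitted."""
    xs = sorted({x for x, _, _ in data})
    candidates = [xs[0] - 1.0] + xs  # includes the 'treat everyone with +1' rule
    best_c, best_v = candidates[0], float("-inf")
    for c in candidates:
        v = ipw_value(lambda x, c=c: 1 if x > c else -1, data, propensity)
        if v > best_v:
            best_c, best_v = c, v
    return best_c, best_v


# Made-up toy data: treatment -1 works better for small x, +1 for large x.
toy_data: List[Record] = [
    (1.0, -1, 3.0), (1.0, +1, 1.0),
    (2.0, -1, 3.0), (2.0, +1, 1.0),
    (3.0, -1, 1.0), (3.0, +1, 3.0),
    (4.0, -1, 1.0), (4.0, +1, 3.0),
]

cutoff, value = best_threshold_rule(toy_data)
print(cutoff, value)
```

On this toy data the search selects the cutoff between x = 2 and x = 3, which is where the better treatment switches. The point of the sketch is only that the rule is chosen by comparing rules directly, without first modelling the outcome and then inverting that model.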

7. What is the best book on statistics that you have ever read?

Looking back, I would have to say that my favourite book is Empirical Processes by van der Vaart and Wellner. It has been out for a while. I really feel that that is the most influential book for me. There were other empirical process books before but this one captured all the concerns I had about applying empirical processes to biostatistics. It dealt with issues where measurability is tricky and it presented coherent ways to handle these issues. The outer measure approach is exactly what we need for a lot of the problems and practical settings that we encounter in biostatistics and so I found it very stimulating. That book actually helped change my research course towards empirical processes.

8. What do you see as the greatest challenges facing the profession of statistics in the coming years?

I have already mentioned being able to adapt and being able to communicate, but I would also like to say that we need to find a way to produce more statisticians who have the right balance of theory and applications. Part of the problem is that there aren’t that many of us, considering the demand for data science. Somehow we have to find a way to increase our numbers sufficiently so that there are enough people thinking statistically in order to have a really positive effect on data science and also on other changes in science that are occurring.

9. Are there people or events that have been influential in your career?

Dennis Tolley is the number one most influential person that I have worked with. Tom Fleming at the University of Washington was my dissertation advisor and is a clinical trials expert. He really got me going in my career. Jon Wellner, who was the Le Cam Lecturer at the Joint Statistical Meetings, was a member of my dissertation committee, and he and I have been friends for many years. He has been very influential in the way that I think about theoretical statistics and has been tremendously supportive and encouraging to me. He is a role model to me and an unbelievably strong theoretical statistician and a very kind person.

A role model who is not a statistician is the collaborator that I worked with on cystic fibrosis for many years, Philip Farrell, who was Chair of Paediatrics when I started working with him at the University of Wisconsin-Madison. He then became Dean of the Medical School and did an incredibly good job both in administrative work and as a clinical scientist. He was very good to work with because he was very interested in different methods of approach, especially if they answered the right scientific questions. There were several times when we could have done something simpler but it would have changed the question that we wanted to ask. He insisted that we figure out the method that was right in order to answer the scientific question, and that was a very interesting and important lesson for me to learn.

10. If you had not got involved in the field of statistics, what do you think you would have done? (Is there another field that you could have seen yourself making an impact on?)

The most serious other field that I worked in was music composition. I have both a BM and an MM in music composition and was double majoring in music composition and pure mathematics before switching from math to statistics. As I took graduate-level statistics courses right off the bat, it was easier for me to become a graduate student in statistics while remaining an undergraduate student in music, so when I graduated in 1988 from Brigham Young University, I received both a BM in music composition and an MS in statistics.

I play the French horn and piano but I was more focused on composing. I took the music GRE whilst a PhD student at the University of Washington, thinking that I might go back and be a composer. Then, when I went to the University of Wisconsin for my first faculty position, part of the way through working towards tenure, I decided that I had to do more music and so I actually worked on and completed an MM in music composition.

Every once in a while, I do some composing, and several years ago I wrote an orchestral piece for my daughter’s senior year in high school for her and her orchestra to play and premiere. I don’t think composing is a practical professional option for me, to be honest, but I could also have gone into computer science, pure math or physics.