“I love genetics because it combines biology with mathematics”: An interview with Dr Alice Whittemore

Dr Alice Whittemore is Professor of Health Research and Policy (Epidemiology) and of Biomedical Data Science at Stanford University School of Medicine. Her research focuses on statistical methods for epidemiological studies of site-specific cancers, particularly cancers of the prostate, ovary, and breast. She is a member of the Institute of Medicine, a Fellow of the American Association for the Advancement of Science, a Fellow of the American Statistical Association, and a member of the American Epidemiological Society. Dr Whittemore has received numerous awards for her work, and she has served on many advisory boards for the National Institutes of Health.

Alison Oliver talked to Professor Whittemore during a recent stay in the UK where she was attending a conference honouring her husband, the world-renowned mathematician Professor Joe Keller who sadly passed towards the end of last year. Here she discusses her career in epidemiology, her research interests and her work with the Breast Cancer Registry.


1. You gained your B.S. in mathematics from Marymount Manhattan College, followed by an MA and PhD, also in mathematics, from the City University of New York. What first introduced you to statistics as a discipline, and what led you to pursue epidemiology and biostatistics as a career instead of mathematics?

That’s a story I enjoy telling. After receiving my PhD in mathematical group theory, I was doing research and teaching at Hunter College of the City University of New York. My research wasn’t going well, and I began realizing that the big names and the deep thinkers in group theory were working on major problems in the field, whilst I was working on such a remote tributary that it was questionable whether more than six or eight people in the world even knew what I was doing.

At about that time, the mathematics department at Hunter College decided to initiate a master’s degree program in statistics, and I thought, what the heck, I’d like to teach a course in this new program. I had never had a statistics course. I hardly knew what statistics was, but I had been well trained in mathematics. The text assigned for my class was Introduction to Mathematical Statistics by Hogg and Craig and it was wonderful. I still have it and refer to it to this day. The authors were extremely rigorous, organized and clear. But I was literally three pages ahead of the class in reading the text, and I would do the exercises and then assign them.

I never wanted to teach calculus again, I never wanted to do pure mathematics again. About that time, I was divorced, a single mother, and each summer I’d put my two young daughters in camp whilst I attended programs in applied maths, economics and operations research. At one of those programs, I learned of a two-year fellowship program allowing pure mathematicians an introduction to applied problems by transplanting them from academic institutions to applied settings, and where they could be confronted with applied problems. Here pure mathematicians could do something useful for a change! So I decided to apply and I was accepted into this program – in fact I think I was the first such “transplant”.

The program, which was sponsored by the US Society for Industrial and Applied Mathematics, offered training at about ten different applied settings in the US. One was the New York City fire department, another was the Department of Genetics at Stanford University. They were all enticing and interesting opportunities, but I didn’t want to yank my daughters out of school to go elsewhere for two years. We were living in New York and I was intrigued by an opportunity at NYU Medical Center, because I’ve always loved biology and was interested in medicine. So I decided to do my fellowship in the Department of Environmental Medicine at NYU Medical Center, working on problems in epidemiology and biostatistics. I was hooked on epidemiology by reading about the work on smoking and lung cancer by Sir Richard Doll, an eminent epidemiologist at the Radcliffe Infirmary at Oxford. The sponsors of my fellowship needed a senior applied mathematician to oversee my work and make sure I wasn’t too overwhelmed with the medical problems. For this they recruited Joe Keller, a world-famous applied mathematician at the Courant Institute of Applied Mathematics at NYU. Joe accepted the job and took it seriously. I would visit his office once a week and report to him my applied investigations. We would write on the chalkboard and stare at various equations. One evening it was late and we were both hungry, and he said, “Do you want to get a bite to eat?” We went out to a Chinese restaurant, and that was the beginning of a love of work, a love of outdoor activities, and a love of each other that was to last until he died this past September. I’m here in England now because of a mathematical conference in his honour in Cambridge, and there will be several more in the US, as well. So I was taught by the very best.

Upon the completion of my two-year transplant fellowship, Joe and I decided to take sabbatical leaves at Stanford University for another year. We found Stanford to be a stimulating and exciting place to work, and we pondered whether to remain or to return to our careers at Hunter College and the Courant Institute. The investigators in Stanford’s Mathematics Department were thrilled at the possibility that they might attract Joe, but nobody quite knew what to do with me.

However, I got lucky again, because the medical school began to see the relatively new field of epidemiology as important for medical practice. They recruited a distinguished epidemiologist, Ralph Paffenbarger, who needed a biostatistical collaborator, and we hit it off. Yet, although I had been a tenured professor at Hunter College, in this new environment I received only an adjunct research position. Despite this setback, we stayed at Stanford. I was busy working and doing projects on lung cancer among uranium miners, and other interesting epidemiological projects. I began working with investigators at the US National Institute of Occupational Safety and Health, which was heavily involved in environmental medicine, and I was happily engaged with writing papers and giving invited talks. One talk invitation was from Harvard, which prompted my colleagues there to offer me a tenured position. I nearly fell off the roof. When others learned about the possibility of our moving to Boston, Joe was offered a position at the nearby Massachusetts Institute of Technology (MIT). So we were planning to go to Boston if Stanford did not offer me tenure.

I recall that one of the academic leaders at Stanford was a woman who went to bat for my case, and alerted the provost about the situation. When the provost learned that the institution might lose both of us if I didn’t get tenure, he paid the medical school dean a personal visit, which is highly unusual, to beg our case. Although epidemiology was a relatively new field unfamiliar to the dean, he took a chance and offered me a tenured position. So my career at Stanford did not follow the usual path of appointment as assistant professor and rising through the tenure ranks. Instead, it was rather unorthodox.

My conversion to research in genetic epidemiology began when learning about families prone to developing a certain cancer, in particular the genetic insights available from studying allele sharing by affected sibling pairs. It was as if lightning flashed, and I thought, “I really want to learn enough genetics to do these studies.” I love genetics because it combines biology with mathematics. For example, we inherit one copy of, say, chromosome 1 from our mother and one from our father. So apart from our egg or sperm cells, each of our cells carries two copies of each chromosome. And it’s a matter of chance (the flip of a coin) which of our parents’ two chromosomes are transmitted to us. So allelic inheritance is random, but with well-defined probabilities, and the statistics are fascinating because you know what the rules are, and you can deduce the consequences of combining the rules with the laws of chance. That’s exactly what applied statistics is all about. The realization that genetic epidemiology was right for me was so mind-blowing that I had to stop working, leave the library, and take a walk around the block for a while just to mull it all over.
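The coin-flip rule described above can be illustrated with a short simulation. This is a minimal sketch, not a method from the interview: it assumes a single locus, ignores recombination, and uses a hypothetical function name (`simulate_sib_pair_sharing`). It counts how many parental chromosome copies two full siblings share, which is the quantity underlying allele-sharing studies of sibling pairs.

```python
import random


def simulate_sib_pair_sharing(n_pairs=100_000, seed=0):
    """Estimate how often two full siblings share 0, 1 or 2 parental copies at one locus."""
    rng = random.Random(seed)
    counts = {0: 0, 1: 0, 2: 0}
    for _ in range(n_pairs):
        # Each child receives copy 0 or 1 from the mother and copy 0 or 1
        # from the father, each chosen by an independent fair coin flip.
        sib1 = (rng.randint(0, 1), rng.randint(0, 1))
        sib2 = (rng.randint(0, 1), rng.randint(0, 1))
        # Count the parents from whom both siblings received the same copy.
        shared = (sib1[0] == sib2[0]) + (sib1[1] == sib2[1])
        counts[shared] += 1
    return {k: v / n_pairs for k, v in counts.items()}


proportions = simulate_sib_pair_sharing()
for k in (0, 1, 2):
    print(f"siblings share {k} parental copies: {proportions[k]:.3f}")
```

Under these assumptions the proportions should settle near 1/4, 1/2 and 1/4, which follow directly from two independent coin flips per sibling.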

2. You joined the faculty at Stanford in 1978. What is it about Stanford that you love that has kept you there?

It’s a relatively small and compact university, geographically. It’s beginning to spread out more now, but it’s been easy to find expertise in biology, mathematics, statistics, engineering, medicine – any field in which one might need consultation. Also, Joe and I had both been New Yorkers who loved to hike, but to find trails near New York City, we’d have to rent a car and drive a long way to reach a state park. In contrast, we were amazed at the abundance of beautiful wilderness hiking trails within walking distance of our campus house at Stanford. The weather is beautiful, and it offers such a wonderful outdoor life.

Stanford is an outstanding place to live and work. I think Joe always regretted leaving New York and the Courant Institute, because his close colleagues and former students were there, and he was very much an outgoing extrovert. He loved people and he missed that close companionship. I, on the other hand, immediately met women who became close lifelong friends and I’m an introvert; I don’t really need people the way Joe did.

One of my new friends was also a cancer epidemiologist. Together, we applied for five NIH grants—this was when such grants were relatively easy to get—and to our amazement, we got all five of them. I had to hire staff and I had no place to put them! I remember some of the computer programmers were sitting on the floor with their computers on their laps until I could get the desks and chairs for them. It was just frantic, but a very happy, busy time.

3. Your main research concerns have centred on cancer, in particular on developing improved statistical methods for the design and conduct of studies involving hereditary predisposition and modifiable lifestyle characteristics in the etiologies of site-specific cancers. Are you working on anything in particular at the moment?

I am working with a dermatologist and a geneticist who are affiliated with the Kaiser Permanente Health Care (KPHC) system in Northern California, which treats and maintains the health records of hundreds of thousands of patients in the region. The KPHC records provide an outstanding epidemiologic resource, because many KPHC members have agreed to complete questionnaires about their lifestyle and provide DNA for genetic analysis, which can then be linked with computerized health outcomes. So the attributes of those who develop a disease like breast or prostate cancer can be compared to those who do not.

My KPHC colleagues have assembled a registry of all of the skin cancers diagnosed among KPHC members since 2007. We have been using this registry to study the genetics of cutaneous squamous cell cancers (cSCCs). What makes these analyses challenging is that susceptible people can develop many such cancers. The medical community is trying to personalize cancer prevention, to save people from overly intensive screening without missing those who do develop a cancer. My colleagues and I are trying to facilitate that goal for several cancers. In particular, we are analyzing KPHC patients’ personal attributes and cSCC histories to develop a model that clinicians can use to stratify the screening of their patients according to their personal risks.

4. Your lecture at JSM 2016 was the COPSS Awards and Fisher Lecture, and your chosen title was Personalizing Disease Prevention: Statistical Challenges. Please could you tell us more about this topic? What is the one thing that you would like your audience to take away from your lecture?

I spoke about the statistical opportunities and challenges involved in developing and evaluating personal risk models to inform disease prevention strategies. Such risk models use a person’s covariates to assign him/her a probability of developing a disease in a given future time period. They need validation with longitudinal cohort data to determine if those with high model-assigned risks really do have high disease incidence compared to those deemed at low risk. An accurate model is said to be well-calibrated to the incidence in the cohort. To check a model’s calibration, the statistician assigns model risks to the cohort subjects using the risk factors they report at cohort entry, and then compares these assigned risks to the subjects’ disease incidence in subsequent follow-up. The main message from my talk was the wealth of interesting statistical problems involved in using cohort data to evaluate the utility of personal risk models.
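The calibration check described above can be sketched in a few lines. This is an illustrative simplification, not the procedure from the lecture: the cohort is simulated, the function name `calibration_table` is hypothetical, every subject is assumed to have complete follow-up (no censoring), and subjects are simply grouped by their assigned risk so that the mean assigned risk in each group can be set against the observed incidence.

```python
import random


def calibration_table(assigned_risks, outcomes, n_groups=5):
    """Compare mean assigned risk with observed incidence in risk-ordered groups."""
    # Sort subjects from lowest to highest model-assigned risk.
    pairs = sorted(zip(assigned_risks, outcomes))
    size = len(pairs) // n_groups
    rows = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = len(pairs) if g == n_groups - 1 else start + size
        chunk = pairs[start:end]
        expected = sum(r for r, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((g + 1, len(chunk), expected, observed))
    return rows


# Simulated cohort (hypothetical numbers): each outcome is drawn from the
# subject's assigned risk, so this model is well calibrated by construction.
rng = random.Random(1)
risks = [rng.uniform(0.01, 0.30) for _ in range(20_000)]
outcomes = [1 if rng.random() < r else 0 for r in risks]

table = calibration_table(risks, outcomes)
for group, n, expected, observed in table:
    print(f"group {group}: n={n}, mean assigned risk={expected:.3f}, "
          f"observed incidence={observed:.3f}")
```

For a well-calibrated model the two columns track each other closely in every group; a model that overstates risk would show assigned risks consistently above the observed incidence.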

5. In recent years, you have been active in helping to establish the Cooperative Family Registry for Breast Cancer Studies. Please could you tell us more about this project?

I’ve been involved in the Breast Cancer Family Registry (BCFR) since the National Cancer Institute first organized it nearly 20 years ago. My colleagues and I were instrumental in developing a two-stage sampling design to efficiently and inexpensively oversample those women at high risk of breast cancer due to a strong family history (FH) of the disease, while still allowing inferences about all women in the underlying population.

Suppose you took a simple random sample of say 10,000 women; maybe only 10% (or 1000) of them will report a FH of breast cancer. You want to focus on these women without sacrificing generalizability to all women in the underlying population. Two-stage sampling designs allow you to do this by sampling all the 1000 FH-positive women but only, say, 10% or 900 of the remaining 9000 FH-negative women. To analyze such data without losing generalizability, your analysis would assign the FH-positive women weights of one, but assign the FH-negative women weights of 10, because each FH-negative woman must represent herself plus nine other unsampled FH-negative women.
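The weighting arithmetic above can be checked with a small sketch. The counts of 10,000, 1000 and 900 women and the weights of 1 and 10 come from the example; everything else is an assumption made for illustration, including the binary attribute, its prevalence (20% among FH-positive and 10% among FH-negative women), and the helper name `weighted_mean`. It is not the BCFR analysis itself.

```python
import random


def weighted_mean(values, weights):
    """Weighted average: each sampled woman counts for `weight` women."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


rng = random.Random(42)

# Hypothetical first-stage sample of 10,000 women as (fh_positive, has_attribute).
fh_pos = [(True, i < 200) for i in range(1000)]   # 20% carry the attribute
fh_neg = [(False, i < 900) for i in range(9000)]  # 10% carry the attribute
stage_one = fh_pos + fh_neg
true_prevalence = sum(a for _, a in stage_one) / len(stage_one)

# Second stage: keep every FH-positive woman, but only a random 10% (900)
# of the FH-negative women.
stage_two = fh_pos + rng.sample(fh_neg, 900)

# FH-positive women get weight 1; each FH-negative woman gets weight 10,
# representing herself plus nine unsampled FH-negative women.
weights = [1 if fh else 10 for fh, _ in stage_two]
values = [1 if a else 0 for _, a in stage_two]

weighted_estimate = weighted_mean(values, weights)
naive_estimate = sum(values) / len(values)

print(f"prevalence in all 10,000 women: {true_prevalence:.3f}")
print(f"weighted two-stage estimate:    {weighted_estimate:.3f}")
print(f"unweighted two-stage estimate:  {naive_estimate:.3f}")
```

The weights sum to 10,000, so the weighted estimate targets the full first-stage sample, whereas the unweighted estimate is pulled toward the oversampled FH-positive women.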

When your goal is to estimate a parameter of interest, this design and analysis allows you to infer what your estimate might have been had you taken a simple random sample. Of course, the variance of this two-stage estimate is larger than that of the more costly estimate based on randomly sampling and analyzing all 10,000 women. But there are accurate closed-form variance estimates that account for the more limited two-stage sampling. Such sampling can be economical and efficient, and we have found it useful in analyzing BCFR data.

Working with these families and analyzing their data continues to be a lot of fun. We just submitted another major renewal application for that consortium. And we’re now following the daughters of women in these multiple-case families.

6. You have authored many publications. What is the article or book that you are most proud of?

That’s an interesting question. I love everything that I work on, and when I’m working on it, I am totally engrossed in it, as if it were the most important thing in the world. That said, I’m particularly proud of a paper showing a link between an informal method that geneticists have been using for nearly a century and well-established statistical techniques. This allowed other investigators to extend the informal method in useful ways. I’m also proud of another project in which I provided an easy way to determine the sample size needed to achieve adequate power to test a hypothesis concerning binary data.

Both of these papers were well appreciated, and it’s been rewarding to see other investigators use the results and then move on to other extensions.

7. What has been the best book on statistics that you have ever read?

My first statistical book, Introduction to Mathematical Statistics by Robert V. Hogg and Allen T. Craig, is my all-time favourite. I absolutely love it, and I continue to consult it today. The Statistical Analysis of Failure Time Data by John D. Kalbfleisch and Ross L. Prentice is a well-worn and beloved guide to much of my statistical work. Another book that I love, as I’m now working with it intensely and appreciating the thought that went into it, is The Statistical Analysis of Recurrent Events by Richard J. Cook and Jerald F. Lawless. I am finding this book essential as my colleagues and I struggle with the data on multiple squamous cell skin cancers.

8. Who are the people that have been influential in your career?

Definitely my husband, Joe Keller, by far and away. My PhD mentor, Gilbert Baumslag who was very crucial in my life at that time – I was buoyed by his insouciance, his wit, his love of group theory, and his effervescence. Ralph Paffenbarger, the Stanford epidemiologist known for his pioneering work on the value of physical activity for human longevity, who influenced me and provided much mentoring and support. But also many other major leaders in cancer epidemiology (all male) who believed in my promise – Joe Fraumeni at the US National Cancer Institute, and Brian Henderson at the University of Southern California, come immediately to mind.

9. If you had not got involved in the field of statistics, what do you think you would have done? (Is there another field that you could have seen yourself making an impact on?)

I think in retrospect I might have become a physician doing both research and clinical work. When I look back on my career, moving from pure mathematics into biostatistics and epidemiology was not a small smooth transition—it was a huge leap. But it felt so right, and that’s taught me that it’s important not to be afraid to make major career changes. And I never regret for a minute the math education I received, because it’s so conducive to rigorous thinking.