Willful Ignorance: The Mismeasure of Uncertainty – An interview with author Herbert I. Weisberg

This month Wiley is proud to publish Willful Ignorance: The Mismeasure of Uncertainty by Herbert I. Weisberg.

This book provides an original account of willful ignorance and how this principle relates to modern probability and statistical methods.

Through a series of colorful stories about great thinkers and the problems they chose to solve, the author traces the historical evolution of probability and explains how statistical methods have helped to propel scientific research. However, the past success of statistics has depended on vast, deliberate simplifications amounting to willful ignorance, and this very success now threatens future advances in medicine, the social sciences, and other fields. Limitations of existing methods result in frequent reversals of scientific findings and recommendations, to the consternation of both scientists and the lay public.

Weisberg exposes the fallacy of regarding probability as the full measure of our uncertainty. The book explains how statistical methodology, though enormously productive and influential over the past century, is approaching a crisis. The deep and troubling divides between qualitative and quantitative modes of research, and between research and practice, are reflections of this underlying problem. The author outlines a path toward the re-engineering of data analysis to help close these gaps and accelerate scientific discovery.

The book also presents essential information and novel ideas that should be of interest to anyone concerned about the future of scientific research. It is especially pertinent for professionals in statistics and related fields, including practicing and research clinicians, biomedical and social science researchers, business leaders, and policy-makers.

Here, Statistics Views interviews author Herbert I. Weisberg, who describes his career and what we can expect from this exciting new work.


1. Congratulations on the upcoming publication of your book, Willful Ignorance: The Mismeasure of Uncertainty, which is an original account of how willful ignorance is the principle underlying modern probability and statistical methods. How did the writing process begin?

I have long felt strongly that quantitative research in the social and biomedical sciences was naïve in its almost exclusive focus on average effects in broad populations. In my previous book Bias and Causation, I explored the fundamental logic of comparative studies, but without considering the effect of limited sample sizes and random variability. In the last chapter of that book, I touched on the development of modern statistics in the early twentieth century. However, I realised that to truly understand the nature and limitations of current statistical methodology would require a much more thorough investigation of the historical context in which Fisher and his followers were working. In particular, it would be necessary to critically examine why and how mathematical probability had come to play such a central role with respect to the analysis of data.

With encouragement from my editor, Steve Quigley at Wiley, I proposed a book on this subject that, with many twists and turns, eventually turned into Willful Ignorance. I then spent over a year conducting the background research on the history, which was fascinating and full of surprises. During this process, my ideas continually evolved and the shape of the book gradually emerged.

2. Who should read the book and why?

I think there are two overlapping potential audiences for my book. First, it should appeal to anyone with a general interest in the history of ideas. Second, it will be of interest to producers and consumers of research in the biomedical and social sciences. Throughout this book I deal with questions that are rarely articulated any more. What does probability really mean? Is mathematical probability a discovery or an invention? Where did our modern concepts of probability and statistics come from? Why was mathematical probability so late in developing?

In the course of my research, I discovered many little-known facts and some widely-held misconceptions about the evolution of mathematical probability. The real story of how mathematical probability became the sole measure of uncertainty is fascinating. But more importantly, it raises the question of whether our current ideas remain adequate for the challenges and opportunities of the data-rich world we are moving into. This question is critically important because, I believe, advances in genetics, neuroscience and information technology are creating possibilities for moving beyond general average causal effects toward increasing individualization.

3. Throughout the book, you trace the historical evolution of probability and explain how statistical methods have helped to propel scientific research. However, you suggest that the past success of statistics has depended on willful ignorance, and this very success now threatens future advances in many fields. Was alerting readers to this threat one of your original objectives in writing this book?

Yes. The book is about the evolution of the concept of probability, which is one of the great intellectual achievements of the past three centuries. Ironically, though, probability is in an important sense incomplete. Our modern version of probability grew out of an earlier idea that was more encompassing but less precise. I believe this archaic sense of probability survives most notably in our courtrooms today, but not in our science. As a result, much of the research in social and biomedical sciences has become superficial and increasingly inadequate to deal with the challenges that lie ahead.

4. You argue that statistical methodology is approaching a crisis. The divides between qualitative and quantitative modes of research, and between research and practice, are reflections of this underlying problem. In the book, you outline a path toward the re-engineering of data analysis to help close these gaps and accelerate scientific discovery. Could you please tell us more about this?

Statistical methods based on probability theory are focused on populations rather than individuals. Consequently, there is a paradox inherent in statistical analyses, especially those that relate to human populations. To reach a general inference about some aspect of human health or behaviour, we must suppress some specific information about individuals. The critical question is how much of such “willful ignorance” is optimal? Neither a purely clinical (qualitative) perspective nor a purely statistical (quantitative) perspective is ideal. I argue that science at its best entails a dynamic interplay between these two extremes. The key is to use everything we know to formulate concepts and hypotheses, but then to rigorously test the implications through repeated independent validation. Like Fisher, I believe that good science comes from sound logical and creative reasoning, not from mechanical application of formal procedures. Statistical methods are tools to be utilised mindfully, not rigid rules to be followed slavishly.
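
To make this point concrete, here is a minimal numerical sketch (an editorial illustration, not an example from the book; the subgroups and effect sizes are purely hypothetical) of how a population average can report "no effect" while suppressing exactly the individual-level information that matters:

```python
# A minimal, hypothetical sketch of "willful ignorance" in averaging:
# a population-average effect can look null even when two subgroups
# experience strong but opposite effects. Values are illustrative only.

subgroup_a = [+2.0, +1.5, +2.5, +1.0]   # individuals who benefit
subgroup_b = [-2.0, -1.5, -2.5, -1.0]   # individuals who are harmed

population = subgroup_a + subgroup_b

# The population-level summary a conventional analysis would report:
average_effect = sum(population) / len(population)
print(f"Population average effect: {average_effect:+.2f}")  # +0.00 -> "no effect"

# The subgroup structure that the average deliberately ignores:
for name, group in [("A", subgroup_a), ("B", subgroup_b)]:
    print(f"Subgroup {name} mean effect: {sum(group) / len(group):+.2f}")
```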

5. What is it about the area of data analysis that fascinates you?

I suppose I have always been intrigued by puzzles. Data analysis often entails the fun of trying to obtain a clever solution to a challenging conundrum, especially one that may have baffled others. Plus, the results can be useful, and sometimes even lucrative. I often joke about “torturing the data into confessing its secrets.” The process of researching and writing Willful Ignorance felt in many ways similar, as I struggled to elucidate how exactly modern probability came to be.

6. Why is this book of particular interest now?

Quite simply, there is a radically new reality emerging from rapid developments in information technology. The extent of potentially useful data is expanding exponentially. Yet, our fundamental ideas about uncertainty and data analysis have changed little for decades. Transitioning from a world in which data is scarce and inaccessible to one in which it will be voluminous and ubiquitous presents a major challenge for statisticians. Our statistical methods must adapt to this changing situation.

7. What were your main objectives during the writing process? What did you set out to achieve in reaching your readers?

My attitude throughout the writing process has been somewhat schizophrenic. On the one hand, I wanted to convey some of the excitement I felt as I came to understand the history of probability as a concept. From this perspective, I was a kind of accidental historian trying to tease out a coherent narrative while sidestepping nearly all of the technical aspects that have been the focus of much previous historical writing about probability and statistics. On the other hand, my purpose was to shed light on the dilemma faced by current research. Applying willful ignorance made modern probability and statistics possible, but also set its limits. So, willful ignorance became a fulcrum around which both the past and the future of data analysis seemed to turn. How much knowledge is enough? That is the basic question that any attempt to deal with uncertainty must ultimately address. I guess my main objective has been to bring this critical issue back into conscious awareness.

8. Were there areas of the book that you found more challenging to write, and if so, why?

The historical discussion that comprises most of the book was fairly straightforward once I recognized the need to focus on how the idea of mathematical probability (as opposed to the mathematics) gradually emerged and evolved. But the introductory and concluding chapters required major rethinking and rewriting as I went along. In these, my personal opinions may be somewhat provocative and controversial, so I wanted to be as clear as I could be to avoid misinterpretation. While I am highly critical of some current aspects of modern statistical theory and practice, I am optimistic about the power of data analysis, including statistics, to improve the human condition. However, it will be important to understand the limitations of current methodological orthodoxy and to transcend them.

9. What will be your next book-length undertaking?

Having spent a good chunk of the last seven years writing two books, I am due for a “sabbatical” year or two. If I do pick up the pen again, it might be to collaborate on a book about model validation, which is terribly important but almost entirely overlooked in the statistical literature. Or maybe I’ll try my hand at a novel – possibly a mystery that revolves around probability somehow.

10. Please could you tell us more about your educational background and what it was that first drew you to statistics as a discipline?

I had always loved math and majored in it as an undergraduate at Columbia. When I graduated from college in 1965, I expected to pursue an academic career as an applied mathematician. However, two elective courses in statistics taken as a senior led me to consider statistics as an alternative. The field was then still in its infancy, and it turned out that getting accepted by a good graduate school was relatively easy and came with a nice stipend. That turned out to be one of the best decisions I ever made. In the short term, graduate school at Harvard was prestigious enough to please my parents, and it also kept me out of the draft for the Vietnam War. Longer term, it led to a highly rewarding career as a statistical consultant. On the downside, it was many years before my family and friends had an inkling of how exactly I was supporting myself.

11. Can you tell us about Causalytics, LLC, which you founded to develop innovative technology for predictive analytics in medical research and business applications?

Along with a partner, Victor Pontes, I founded Causalytics as a start-up about a year ago. The idea for a business grew out of research related to personalized medicine. In Willful Ignorance, I advocate a new approach toward data analysis for discovery. I believe the time has come when exploratory data analysis in the spirit of Tukey’s pioneering efforts 50 years ago can finally be implemented effectively. Through Causalytics, we hope to develop commercially viable algorithms and models that exploit the new data resources and computational capabilities becoming available. We are especially interested in methods to analyse genomic and biomarker data more effectively.

12. Are there people or events that have been influential in your career?

There have been quite a few, actually, so I can only name some of the most important ones.

My thesis advisor at Harvard, Art Dempster, first stimulated my interest in the history and philosophy related to probability. His pioneering work related to broadening the scope of mathematical probability inspired my dissertation, and he introduced me to the much-misunderstood ideas of R. A. Fisher. In graduate school I greatly admired Prof. William G. Cochran, whose courses were models of clarity and practicality, delivered with great charm and wit. It tickled me as well that he had collaborated with Fisher, so there was only a single degree of separation between me and the originator of modern statistics.

Three years as an assistant professor at NYU convinced me that I was not cut out for academia. So, I was lured back to Cambridge by Marshall Smith, then at the Huron Institute, leading to a fruitful collaboration with Tony Bryk (then a promising graduate student) that exposed me to the world of program and policy evaluation. My research there on early childhood education got me thinking about both the promise and limitations of statistical methods. I was also influenced by the work of Donald Campbell and Lee Cronbach during this period. Over the years, I have increasingly come to appreciate their insightful perspectives on data analysis, including but not limited to statistics.

The remainder of my career has been spent as a statistical consultant. Colleagues who have been particularly important to me are Mike Meyer, Jarvis Kellogg, Richard Derrig, and Bruce Parker. I am grateful to each of them for connecting me with an area of research and/or consulting that has been professionally fulfilling. Finally, without the gentle prodding by my two editors at Wiley, first Bea Shube and now Steve Quigley, I would never have had the nerve to become a serious writer.