Causality in a Social World: An interview with author Guanglei Hong

Next month, Wiley is proud to publish Causality in a Social World: Moderation, Mediation and Spill-over, which introduces innovative new statistical research and strategies for investigating moderated intervention effects, mediated intervention effects, and spill-over effects using experimental or quasi-experimental data.

The book uses potential outcomes to define causal effects, explains and evaluates identification assumptions using application examples, and compares innovative statistical strategies with conventional analysis methods. Whilst highlighting the crucial role of good research design and the evaluation of assumptions required for identifying causal effects in the context of each application, the author demonstrates that improved statistical procedures will greatly enhance the empirical study of causal relationship theory.

Applications focus on interventions designed to improve outcomes for participants who are embedded in social settings, including families, classrooms, schools, neighbourhoods, and workplaces.

Statistics Views talks to author Guanglei Hong, Associate Professor at the University of Chicago, about writing the book.

 

1. Congratulations on the publication of your book, Causality in a Social World: Moderation, Mediation and Spill-over, which introduces innovative new statistical research and strategies for investigating moderated intervention effects, mediated intervention effects, and spill-over effects using experimental or quasi-experimental data. How did the writing process begin?

Several friends and colleagues of mine suggested to me a few years ago that I should write a book on causal inference for social scientists. The argument was that, being a social scientist myself, I may bring to causal inference research some important insights gained in studying human development and society and may have the privilege of knowing how to communicate the nuance of causal inference concepts and methods to fellow social scientists. At the time I was uncertain about how to allocate my time between discovering something new and summarizing what I had already learned. A discussion with Debbie Jupe, Commissioning Editor from Wiley, at the Joint Statistical Meetings in 2011 led to a book proposal that received very encouraging reviews. The book project was thus launched. Interestingly, I found myself continuing to search for a balance between exploring new territories and solidifying old wisdom throughout the writing process.

2. What were your main objectives during the writing process?

In writing this book, my main objectives were to clarify for applied researchers the theoretical concepts of multivalued treatments, moderated effects, mediated effects, and spill-over effects. For example, I have seen very often that the term “moderation” is misused and confused with the term “mediation” in the published behavioral and social science research. By introducing the framework of potential outcomes, I hope to clear some of the terminological ambiguity and conceptual confusion. In addition, the book systematically introduces innovative statistical strategies for investigating these causal effects and aims to make them readily accessible to a wide audience. I placed a major emphasis on explicating and evaluating, in the context of real applications, the assumptions required for relating causal parameters of interest to empirical data given a specific research design. Formalizing causal reasoning and distinguishing it from pure descriptive analysis, I believe, is essential for advancing the empirical basis of knowledge building.

3. Whilst highlighting the crucial role of good research design and the evaluation of assumptions required for identifying causal effects in the context of each application, you demonstrate in the book that improved statistical procedures will greatly enhance the empirical study of causal relationship theory. Professor Judea Pearl said in a video interview here on Statistics Views that “Causal inference begins in the crib” and that “the quest for causal explanation is as old as human civilization. What has been delayed, however, is the ability of mankind to mathematize it. For some strange reasons which I am still exploring, it took 2000 years to develop, and only recently we have acquired a mathematical language in which we can express what we want to find out, what we are willing to assume about reality, put the two together with data and come up with an answer.” Would you be in agreement and how do we need to improve our statistical procedures?

I cannot agree more with Professor Judea Pearl’s observation. I started the book with an ancient Chinese fable in which a farmer, upon making the discovery that taller crops tended to produce more yield, worked a long day to pull his seedlings upward. Of course, all his plants wilted and died the next day. This experiment, though costly, should advance the farmer’s knowledge of “what does not work.” We can find numerous analogies in today’s world in which well-intentioned interventions fail to produce the intended benefits or even breed unintended consequences. Much is to be learned about whether an intervention works, which version of the intervention works, for whom, under what conditions, why it works, or why it does not work. I agree that every scientific mind in training should develop the capability of mathematizing causality. Without disciplined reasoning, a causal inference would slip into the rut of a “casual inference” as easily as making a typographical error. This is also true when sophisticated statistical procedures are misused, especially when data are inadequate.

4. Throughout the book, applications focus on interventions designed to improve outcomes for participants who are embedded in social settings, including families, classrooms, schools, neighbourhoods, and workplaces. If there is one piece of information or advice that you would want your reader to take away and remember after reading your book, what would that be?

I must restate the comment from one of the anonymous reviewers who cautioned that “causality cannot be established on a pure statistical ground.” Yet I maintain that improved statistical procedures along with improved research designs would greatly enhance our ability to reveal causality in a social world.

5. Who should read the book and why?

The book is suitable for use in introductory or advanced level applied research methods courses for postgraduate and undergraduate students. When used in an introductory level course, the book teaches how to conceptualize causal relationships, how to relate social scientific theories to empirical evidence, what data to collect, how to analyse the data, what assumptions to examine, and what the strengths and limitations of alternative analytic strategies are. For advanced level students conducting original research, the book highlights methodological challenges often encountered in social scientific inquiry along with innovative analytic tools that will greatly expand their research capacity. The book may also serve as a reference for academic researchers and public policy analysts.

6. Why is this book of particular interest now?

Over the years, I have noticed that psychological research often presents well-articulated theories of causal mechanisms relating stimuli to responses. Yet such research is often short of analytic strategies for empirically screening competing theories that explain the observed effect. Sociologists have keen interest in the spill-over of treatment effects transmitted through social interactions yet have produced limited evidence quantifying such effects. A new framework for conceptualizing moderated, mediated, and spill-over effects has emerged relatively recently in the statistics and econometrics literature on causal inference. Among social scientists, awareness of methodological challenges in causal investigations is higher than ever before. In response to this rising demand, causal inference has become one of the most productive scholarly fields of the past decade. More importantly, the pursuit of causal inference is a multidisciplinary endeavour. Substantive inquiries and methodological advances can always feed into each other. It is my hope that the book will instigate a great deal of interdisciplinary exchange between statisticians and applied researchers.

7. Were there areas of the book that you found more challenging to write, and if so, why?

The challenges that arose in writing this book were similar to the challenges that I have encountered in teaching quantitative methods to a group of students diverse in their previous statistical training. It would be risky to assume that the student—in this case, the reader—has already acquired a solid understanding of all the basic elements of statistics; yet it would also be infeasible to teach all the basic elements in addition to the new materials that the book needs to focus on. Hence I was constantly wary of losing my readers whenever I moved into a relatively advanced topic. I did, however, find someone who had minimal statistical training to read a draft of every chapter and used her feedback to identify places where additional scaffolding might be helpful.

8. What is it about the area of causality that fascinates you?

To me, causality is driven by the human desire to get to the bottom of everything and make the reasoning transparent. This is the essence shared by all scientific undertakings; and this is what makes “evidence” trustworthy in guiding evidence-based decision-making. Causal inference is a business about minimizing ambiguity and then using statistical inference to accurately quantify uncertainty. Even the most sophisticated statistical procedures cannot avoid generating completely ambiguous and often misleading results if the data analyst is unclear about possible differences between the causal parameter of interest and what her analysis is producing. This is often true regardless of how “big” or how “small” her data are. Lord’s paradox and Simpson’s paradox are famous examples that I use in the book to show how causal inference helps reveal important loopholes in human cognition. How can we help social scientists address important and often complex causal questions without being misled by spurious patterns in the observed data? That defines a field of research in need of a lot more manpower than we currently have.

9. What will be your next book-length undertaking?

To be frank, the next thing I will be eager to do is to update the current book as I continue to gain new understanding and make progress with my colleagues in our research.

10. You are currently an Associate Professor with tenure in the Department of Comparative Human Development at the University of Chicago. Please could you tell us more about your educational background and what was it that brought you to recognise statistics as a discipline in the first place?

When I was halfway into my PhD program in education at the University of Michigan, I realized that, although I had been shopping for a wide range of quantitative methods courses, what I ultimately needed was an understanding of the foundation of the statistics discipline. Without such foundational knowledge, I would simply be a consumer of statistical software programs barely aware of what is happening behind the scenes. So I made up my mind to take the required courses for those majoring in statistics and biostatistics and became one of the first graduates of the new joint program in education and statistics. Teaching the applied statistics courses, especially the introductory-level course, brought me an even deeper appreciation of the coherence of the statistical knowledge system. More excitingly, I have found that there is vast room for new creations in statistics. This excitement has led to the development of the new analytic strategies that I presented in this book.