“I hope that my most exciting work lies in the future”: An interview with Professor Joseph Gastwirth, winner of the Gottfried E. Noether Senior Scholar Award


  • Author: Statistics Views
  • Date: 01 Feb 2013
  • Copyright: Photography appears courtesy of George Washington University.

At last year’s Joint Statistical Meetings in San Diego, Joseph Gastwirth, Professor of Statistics and Economics at George Washington University, won the prestigious Gottfried E. Noether Senior Scholar Award for ‘outstanding contributions to the theory, applications, and teaching of nonparametric statistics’ (Amstat News, 1st October 2012).

Throughout his outstanding career, Professor Gastwirth has taught at many reputable institutions such as the Massachusetts Institute of Technology and Harvard University and was formerly a Visiting Advisor for the Office of Statistical Policy in the Executive Office of the President of the United States. Statistics Views talks to Professor Gastwirth about his work on nonparametric and robust methods, his book Statistical Science in the Courtroom, his current research connecting art with statistics, and his time working for then-President Nixon.

An interview with the winner of the Gottfried E. Noether Young Scholar Award, Professor Guang Cheng, will follow next week. 


1. Congratulations on winning the Gottfried E. Noether Senior Scholar Award for 2012. With the exceptional teaching career that you are continuing to enjoy, what was it in the first place that inspired you to pursue a career in statistics and mathematics? When and how did you first become aware of statistics as a discipline?

Math was my favourite subject in school and fortunately I was good at it. I guess I was pre-destined to work in an area with a heavy quantitative component. While every boy interested in sports learns about “statistics” like batting averages of ball players, only in college and graduate school did I learn about the discipline of statistics and its mathematical (probabilistic) foundations.

2. From 1971-1972, you were a Visiting Advisor for the Office of Statistical Policy in the Executive Office of the President. What did the role entail and did you get the opportunity to meet the then President Nixon?

In 1970 (approximately) the President appointed a Commission to review the Federal Statistical Operation. The late Prof. Fred Mosteller of Harvard served as Vice-Chair and suggested that the government bring in some academics for a one-year visit. I had done some research on measuring income inequality using Census data, so initially I was asked to look into projects relating to labour market statistics. At the time, the local area unemployment rates were being used to allocate five billion dollars to high unemployment areas for job-training. I noticed that part of one of the optional formulas had the effect of excluding minorities who worked in jobs that were not covered by Unemployment Insurance. So we developed a correction factor to adjust for this. I think it made that option more complex, so most places used a different one that was less biased.

During that year several new projects came up. As the regular staff had a full portfolio of tasks already, I was asked to work on two of them. One was the use of statistical data to assist in enforcing the equal employment laws. The other was estimating the number of “heroin addicts”. The second one started out in an odd way. In February 1972, President Nixon gave a TV speech in which he stated that the nation’s number one problem was heroin addiction. I actually had listened to it. Sometime the next day, the head of the Statistical Policy Office asked me to come by. It turned out that nobody could answer the following question: How many heroin addicts were there in the U.S. at that time? They formed an inter-agency panel and I was asked to assist the OMB representative, Robert Pearl from our office.

3. You continue to teach at George Washington University and have taught at many reputable institutions such as MIT and Harvard. As a university professor, what do you think the future of teaching statistics will be? What do you think will be the upcoming challenges in engaging students?

As P.T. Barnum said, predictions are very hard to make, especially about the future. The internet and other technological developments will continue to have an impact on university education. With the financial pressure on many Western governments, my guess is that the funds available for both teaching and research will grow at a much slower rate than in the period since WWII. One consequence may be that students will take more on-line classes from “master teachers” perhaps with a once-a-week discussion session (in smaller groups) with a faculty member at their university.

One challenge that is already happening is getting the students to focus on the lecture during class, rather than search the web with their iPhones and laptops for email etc. I heard that Harvard Law School is even thinking about banning laptops from class because their students are paying less attention in class.

4. In relation to the above question, do you think that statistics undergraduates and postgraduates starting out today are under more pressure to publish and to obtain grants than when you were a student yourself?

At major universities there was always pressure to publish, but if one worked in an area that was not supported by government granting agencies, one was OK as long as the papers appeared in respectable journals. I think that new faculty are under more pressure to obtain grants. Indeed, some University Administrations practically equate research with sponsored research. As universities in the U.S.A. are non-profit, don’t pay taxes on their income or real estate (used for educational purposes) and receive tax-deductible contributions from alumni etc., I think this over-emphasis on money is inappropriate.

For example, suppose Albert Einstein were alive and active today. He would probably receive a grant supporting him for the summer, with money for one or two graduate students and perhaps a post-doc. Measured by money raised through grants, he would only be considered a modest producer. Now consider people running large medical clinical trials, which could also be run by private firms, or some of the big survey operations where several thousand people are interviewed every month. These types of activities require a substantial staff, with a large salary budget. In the U.S. the universities receive overhead calculated as a percentage, e.g. 50%, of the salary budget. Also, several other research or survey teams could probably do the work almost as well. Intellectually, Einstein’s work is so much superior that it is ludicrous to compare it with the intellectual contributions of the typical head of a survey unit or large-scale clinical trial.

"An over-emphasis on bringing in funds has the potential to distort the focus of the next generation of scholars."

These large scale clinical trials and surveys are quite useful and provide important information. It is just that the cost to carry them out is many times that needed to support most theoretical or “thinking” projects. An over-emphasis on bringing in funds has the potential to distort the focus of the next generation of scholars.

5. Do you have any advice for students considering a university degree in statistics?

As a consequence of the rapid developments in computer technology, the field of statistics has changed dramatically over the years, so students should make sure they acquire a good background in statistical computing and are able to use at least two of the major statistical packages. Right now there is an emphasis on Big Data and data mining, so they will need to know how to handle large data sets. In the future more projects, whether academic research or work in government or industry, will be carried out by interdisciplinary teams and the curriculum may need to evolve.

6. You have contributed widely to statistical research applied to legal problems which led to your book, Statistical Science in the Courtroom (Springer, 2000). With a strong educational background in mathematics at Yale, Princeton and Columbia, what was it that led you to research law in terms of statistics and what were your aims in writing the book?

When I visited the government in 1971-72, one of my tasks was serving on a technical committee studying how the 1970 Census data, which was just becoming available, and other statistics produced by government agencies could help the Justice Department and Equal Employment Opportunity Commission target companies that were not in compliance with the law. A few years later a law firm, which had obtained my name from one of the government agencies represented on the Committee, asked me to consult with them and possibly be their expert in a case. They represented a company defending a case. At first, I was not sure but I felt it would be interesting to look at the area from the point of view of the “other side” as my experience until then had been with the government agencies. Then I met a lawyer for the National District Attorneys Association, who was concerned with pyramid fraud cases. The track record of the prosecutors in these cases had only been so-so. In part, this was because the pyramid operator could usually produce someone who had been successful. I developed a probability model, which showed that there would be a small proportion of successful participants but the vast majority would lose money. I gave a talk at one of their meetings and then published an article on that model.

I realized that this area could generate a variety of problems in statistics and probability and Washington is one of the best places to be for contacts with leading firms, as well as having access to court records etc.

7. Over the years, how has your teaching, consulting, and research motivated and influenced each other? Do you continue to get research ideas from statistics and incorporate your ideas into your teaching? Where do you get inspiration for your research projects and books?

Of course, teaching, consulting and research are interconnected. A few years ago a former student of mine, who now is a partner at a major law firm, contacted me with regard to a securities law case. During our study and the interactions between the two parties, a variety of issues arose. We needed to adapt methodology that had been originally developed for biostatistics in order to measure the fairness of the allocations of shares the firm made. We also needed to think about how best to present some of the results. This led us to using robust estimators in the standard QQ plot to show that data on profits was not normally distributed. Subsequently, several of us explored the use of robust estimators in formal statistical tests of normality.

One advantage of the field of legal applications is that there are so many cases and regulatory hearings concerning almost every aspect of life that invariably many questions are raised. Often one can use current cases or public policy issues, especially those in the news, to motivate students in class as well as initiate research projects.

8. What has been the most exciting development that you have worked on in statistics during your career?

This is a very difficult question because I am excited about virtually all of the projects and hope that my most exciting work lies in the future. Two less well-known papers I really enjoyed working on connected a major side interest, art, with statistics. The auction houses only publish the prices of the works that sell. This creates a well-known selection bias. In a used bookstore I came across the catalogues of two auctions, which contained the in-house information on “highest bids” for all the items. Dr. Binbing Yu and I noticed that including them raised the average percentage prediction error from about 30% to 40%. Then we replaced the assumed normally distributed random error in Heckman’s selection model with the “fatter” tailed t-distribution with 2 df and showed that it provided better estimates of the highest bids for the usually unreported items than the usual selection models. The second project estimated the fraction of art offered on eBay that is said to be by Henry Moore and actually is. It turned out that an estimated less than 10% of the small sculptures and drawings were genuine; however, about 90% of the signed prints were. The questionable prints tended to be those that were also published in an unsigned edition. That work was actually mentioned in an article in the New York Times this past Labour Day weekend.

"Two less well-known papers I really enjoyed working on connected a major side interest, art, with statistics."

If you really pushed me to choose one topic for “most exciting”, I would probably say the work on efficiency robustness. My first paper in 1966 focused on finding one estimate or test statistic that would have high efficiency over a family of plausible underlying statistical models. In 1985, the choice of measure of the difference between the pass rates of minority and majority individuals led me to see whether one can obtain a test that would have high power across several models. As the development of an efficiency robust procedure basically depends on the correlations between the optimal statistics for each model in a set of plausible ones, the approach is applicable to a wide variety of areas, e.g. choosing one non-parametric test among several for survival analysis or one set of scores for testing for a trend or an association in contingency tables. Recently, the approach has been used in genetic epidemiology, where the underlying genetic model (recessive, additive or dominant) is not known. With modern computational power, one can combine this approach with an adaptive one that eliminates some of the models as data accumulates.
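To illustrate how the correlation between optimal statistics enters, here is the simplest textbook case with only two candidate models. The notation is an assumption made for this sketch and does not come from the interview: Z_1 and Z_2 are the standardized optimal test statistics for the two models, and ρ ≥ 0 is their correlation under the null hypothesis.

```latex
% Two-model case of a maximin efficiency robust test (notation assumed for illustration)
\[
  Z_{\mathrm{MERT}} \;=\; \frac{Z_1 + Z_2}{\sqrt{2\,(1+\rho)}},
  \qquad
  \operatorname{corr}\!\left(Z_{\mathrm{MERT}},\, Z_i\right) \;=\; \sqrt{\frac{1+\rho}{2}}
  \quad (i = 1, 2),
\]
\[
  \text{so the asymptotic efficiency of } Z_{\mathrm{MERT}}
  \text{ relative to the optimal test under either model is } \frac{1+\rho}{2}.
\]
```

The higher the correlation between the two optimal statistics, the less efficiency is given up by using one combined test; for example, ρ = 0.8 leaves a relative efficiency of 0.9 under either model.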

9. What do you think the most important recent developments in the field have been? What do you think will be the most exciting and productive areas of research in statistics during the next few years?

The availability of high-powered computing is driving several important developments. It has enabled the bootstrap and related re-sampling techniques to be routinely used in the analysis of data. The distributions of many statistics can now be simulated, allowing for more complicated methods that provide greater insight into the underlying structure of the data or phenomena under investigation. Far more realistic prior distributions can now be implemented in Bayesian analyses of data, which may increase the acceptability of Bayesian methods. As you know, in the U.K. courts have been somewhat reluctant to embrace Bayesian arguments.
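As a small illustration of the kind of re-sampling that cheap computing makes routine, the following is a minimal sketch of a percentile bootstrap confidence interval for a median. The function name, its parameters and the sample values are all made up for this example; they are not taken from any study mentioned in the interview.

```python
import random
import statistics


def bootstrap_percentile_ci(data, stat=statistics.median,
                            n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat` (hypothetical helper)."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    n = len(data)
    # Recompute the statistic on samples drawn with replacement from the data.
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Invented sample with one large outlier, purely for illustration.
sample = [2.1, 3.4, 1.9, 5.6, 2.8, 3.1, 4.7, 2.5, 3.9, 12.0]
low, high = bootstrap_percentile_ci(sample)
print(f"approximate 95% interval for the median: ({low}, {high})")
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is the shortest to write down.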

"...distributions of many statistics can now be simulated, allowing for more complicated methods that provide greater insight to the underlying structure of the data or phenomena under investigation."

The “buzzword” at last year’s Joint Statistical Meetings in San Diego was “Big Data”. How should one make use of the huge amount of data collected by firms, major internet sites (Facebook, Google) and providers?

10. What do you see as the greatest challenges facing the profession of statistics in the coming years?

Oddly enough, the readily available statistical packages may make users of statistics more confident in their ability to analyse data without the help of trained statisticians and potentially reduce job opportunities. Often they do not use the most appropriate method. Recently, Dr. B. Yu and I reanalysed data on peremptory challenges in death penalty cases that had been submitted into evidence by two lawyers, who analysed several two-by-two tables by the “paired t-test”. Trained statisticians would use the Cochran-Mantel-Haenszel test, which yielded a far more significant result.
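For readers unfamiliar with the Cochran-Mantel-Haenszel test, the sketch below shows how it pools several two-by-two tables into a single chi-square statistic with one degree of freedom. The function name and the counts are hypothetical and chosen only to show the calculation; they are not the data from the death penalty cases.

```python
import math


def cmh_test(tables, continuity=True):
    """Cochran-Mantel-Haenszel test for a list of 2x2 tables.

    Each table is (a, b, c, d): first row (a, b), second row (c, d).
    Returns the chi-square statistic (1 df) and its p-value.
    """
    sum_diff = 0.0   # sum over strata of observed minus expected for cell a
    sum_var = 0.0    # sum over strata of the hypergeometric variance of a
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue                     # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        sum_diff += a - row1 * col1 / n
        sum_var += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if sum_var == 0:
        raise ValueError("no informative strata")
    num = abs(sum_diff)
    if continuity:
        num = max(num - 0.5, 0.0)        # optional continuity correction
    stat = num ** 2 / sum_var
    # Upper tail of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical strata (e.g. separate trials); rows = two groups of jurors,
# columns = struck / not struck. The numbers are invented.
strata = [(8, 4, 5, 13), (6, 3, 4, 12), (9, 5, 6, 14)]
stat, p_value = cmh_test(strata)
print(f"CMH chi-square = {stat:.2f}, p = {p_value:.4f}")
```

The test keeps each table's own margins fixed and only then combines the evidence, which is why it is the natural choice for stratified two-by-two tables, whereas a paired t-test ignores that structure.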

It is becoming increasingly difficult to separate parts of statistics from computer science, especially artificial intelligence, and other fields. In the future we may see a more formal integration of several related disciplines that deal with collecting and analysing data and interpreting the analysis.

Another potential issue concerns the declining willingness of the public to respond to surveys. There has been a slight uptick in both the overall non-response rate and refusal rate in the U.S. Current Population Survey, although the overall response rate remains above 90%. When I first came to Washington in the early 1970’s, the government expected response rates to be at least 75% but I recently learned that they have weakened that requirement. For surveys sponsored by business or non-profit organizations, however, response rates above 75% are rare and some have response rates less than 50%. The possible causes will need to be addressed: has the public just grown tired of being bothered by interviewers, lost confidence or trust in the government and business, or become increasingly uneasy with the loss of privacy? A particular problem here is combining information from other sources with the responses given in a survey. With all the information about one readily available from the Web, it is difficult to be an “anonymous respondent”, especially if the government agency or firm carrying out the survey does not inform potential respondents that certain other information will be combined with the survey.

11. You have authored numerous publications. Is there a particular article that you are most proud of?

It would be difficult for me to pick just one article, so I hope I can mention a few. In terms of citations, my 1972 article on estimating the Lorenz curve and Gini index from grouped data in the Review of Economics and Statistics is probably the best known. From the viewpoint of statistical theory and methodology, I would mention my 1966 JASA paper on robust methods, which included a simple robust estimator and a method for obtaining efficiency robust tests, and the related 1967 Annals paper with Professors Herman Chernoff and Vernon Johns on the distribution theory of linear combinations of functions of order statistics along with investigations. I am also proud of the articles dealing with the effect of dependence on statistical procedures, first with Professor Herman Rubin and Steve Wolff and lately with Weiwen Miao and Yulia Gel in the context of data arising in legal cases.
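To give a flavour of what estimating the Gini index from grouped data involves, here is a minimal sketch that joins the grouped Lorenz curve points by straight lines and applies the trapezoid rule. Because it ignores inequality within each group, this gives a lower bound on the index. It is a generic illustration, not a reproduction of the method in the 1972 article, and the function name and shares are invented.

```python
def gini_from_grouped(pop_shares, income_shares):
    """Lower-bound Gini index from grouped data (hypothetical helper).

    pop_shares[i]    : fraction of the population in group i
    income_shares[i] : fraction of total income received by group i
    Groups must be ordered from poorest to richest; each list sums to 1.
    """
    cum_income = 0.0   # Lorenz curve height at the left edge of the group
    area = 0.0         # area under the piecewise-linear Lorenz curve
    for p, s in zip(pop_shares, income_shares):
        next_income = cum_income + s
        area += p * (cum_income + next_income) / 2   # trapezoid for this group
        cum_income = next_income
    return 1 - 2 * area


# Invented example: the poorer half of the population holds 20% of income.
print(gini_from_grouped([0.5, 0.5], [0.2, 0.8]))
```

With finer income groups the straight-line Lorenz curve approaches the true curve, so the lower bound tightens.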

12. Are there people or events that have been influential in your career?

Surely the values and encouragement one receives from one's parents are most influential. As a PhD student at Columbia I benefited immensely from my PhD advisor, Lajos Takacs, from many discussions with Colin Mallows and from general guidance from Professor Herbert Robbins. My post-doctoral year at Stanford brought me in contact with Herman Chernoff and Herbert Solomon, who encouraged my interest in nonparametric and robust methods. About five or six years later, I became interested in applications in economics, especially the measurement of inequality, and was fortunate to be able to spend my first sabbatical leave at Harvard, the same year that Professor Fred Mosteller was serving as Vice-Chair of a Presidential Commission on Federal Statistics. He suggested that I spend a year in Washington working on economic statistics and I visited the Statistical Policy Division of the Office of Management and Budget (OMB). During that year I became exposed to the use of statistical data and analyses in the area of equal employment as well as other important applications to economics and public policy. Returning to academia I wrote an article based on some of the work I did for the task force on equal employment. Subsequently, some lawyers contacted me and I did some consulting work on various cases, always with an eye on using the questions involving statistics to motivate future work.

I also was asked to consult with OMB on a variety of issues, one of which involved the epidemiologic studies linking Reye's syndrome and aspirin use. I needed to learn a fair amount of epidemiology but was able to talk to my colleague Sam Greenhouse, who had retired from the National Institutes of Health and joined our Department here at George Washington. In 1985, I was fortunate to receive a Guggenheim Fellowship, which enabled me to devote an entire year to my book on Statistical Reasoning in Law and Public Policy.

I would be remiss if I failed to mention that much of the research I have carried out benefitted greatly from the dedicated efforts of many of my former PhD students, e.g. Boris Freidlin, Mitchell Gail, Patricia Hammick, Abba Krieger, Maurice Owens, Marvin Podgor, Ed Rothman, Michael Sinclair, Woolcott Smith, Terry Smith, Nancy Spruill, Gang Zheng and Binbing Yu, as well as colleagues and co-authors, e.g. Tapan Nayak, Zhaohai Li, Reza Modarres, Blaza Toman and Qing Pan at George Washington, P.K. Bhattacharya, Jane-Ling Wang and Wes Johnson at UC Davis, and Weiwen Miao and Yulia Gel, who have worked with me on projects that arose from actual legal cases. One advantage of being a Professor is that one has the opportunity to interact with many very bright and energetic PhD students and younger faculty members.


Address:

This website is provided by John Wiley & Sons Limited, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ (Company No: 00641132, VAT No: 376766987)

Published features on StatisticsViews.com are checked for statistical accuracy by a panel from the European Network for Business and Industrial Statistics (ENBIS), to whom Wiley and StatisticsViews.com express their gratitude. The panel members are: Ron Kenett, David Steinberg, Shirley Coleman, Irena Ograjenšek, Fabrizio Ruggeri, Rainer Göb, Philippe Castagliola, Xavier Tort-Martorell, Bart De Ketelaere, Antonio Pievatolo, Martina Vandebroek, Lance Mitchell, Gilbert Saporta, Helmut Waldl and Stelios Psarakis.