Robustness in statistical modelling

Author: Dr Matthew Schofield

Every statistical model has underlying assumptions. Statistical methodology is developed and understood by treating the model as true. The problem is that no model is ever correct. The assumptions underlying any statistical procedure will not be satisfied by real data. The idea behind robust statistics is that we want our methodology to be insensitive to model misspecification. A robust procedure is one that is still useful even when the assumptions are violated.

Robust approaches have traditionally focused on the effect of outliers. If a model assumes the observations are normally distributed, we expect to see a value more than four standard deviations from the mean only about once in every 15,000 observations. An observation that is many standard deviations from the mean can have a dramatic effect on standard statistical approaches. Suppose we had collected data on the incomes of 1000 residents of the Seattle metropolitan area in an effort to estimate the mean income. If Bill Gates happened to be chosen for our sample, he would excessively influence our results. If rumours are true and he earns $10 billion a year, then irrespective of the incomes of the other 999 individuals, the sample mean can be no smaller than $10 million, an estimate that is clearly ridiculous. Robust estimators lead to more reasonable estimates. One such estimator is the sample median, which is unaffected by the most extreme observations in the sample.
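To make the arithmetic concrete, here is a minimal sketch in Python (with made-up income figures; the numbers are illustrative, not real data) showing how a single extreme value drags the sample mean while leaving the median essentially untouched:

```python
import numpy as np

rng = np.random.default_rng(1)

# 999 hypothetical "ordinary" incomes, plus one extreme value
# standing in for the rumoured $10 billion earner.
incomes = rng.normal(loc=75_000, scale=20_000, size=999)
incomes = np.append(incomes, 10_000_000_000)

print(f"mean:   ${np.mean(incomes):,.0f}")    # dragged above $10 million by the one outlier
print(f"median: ${np.median(incomes):,.0f}")  # essentially unchanged
```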

This example shows the power of robust procedures, but also reveals some limitations. We have done nothing to fix the model misspecification, the root cause of the problem. Instead we have specified an alternative estimator designed to perform well in the presence of outliers. For more complex statistical models, such as those commonly used in ecology and the environmental sciences, there are no estimation procedures that can be widely used to ensure robustness. One solution is to develop statistical models that better describe the underlying data, avoiding the problems that come from model misspecification. In the case of outliers, heavy-tailed or skewed distributions could be considered in place of a normal distribution.
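As a hedged illustration of that idea (a toy sketch with simulated data, not a recommendation for any particular dataset), the snippet below fits both a normal and a heavy-tailed Student-t model to a contaminated sample; the location estimate under the t model is far less affected by the extreme value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = np.append(rng.normal(0.0, 1.0, size=99), 50.0)  # one gross outlier

# Maximum-likelihood fits under a normal and a heavy-tailed Student-t model.
norm_loc, norm_scale = stats.norm.fit(data)
t_df, t_loc, t_scale = stats.t.fit(data)

print(f"normal location estimate: {norm_loc:.2f}")  # pulled towards the outlier
print(f"t location estimate:      {t_loc:.2f}")     # stays near the true centre of 0
```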

As models increase in complexity, the number of assumptions that could be violated increases. Our ability to specify and fit such models has improved faster than our ability to assess the underlying assumptions and understand the implications of model misspecification. This is particularly troubling in situations where a lack of information makes model assessment difficult, so that the model appears consistent with the data irrespective of the true data-generating process. To assess robustness in such situations, we can consider specific violations of the model assumptions and investigate how predictions of key quantities are affected.

We have recently explored aspects of climate reconstruction using tree-ring data. The idea is that tree-ring growth measurements can serve as a long-term, imperfect temperature gauge, extending the instrumental temperature readings that are only available for the last 100 years. An important part of the statistical model is the relationship between the growth measurements and the instrumental readings. We found that the estimation of historical temperature was not robust to the specification of this relationship. The data fit several competing models equally well, despite those models leading to different predictions of historical temperature. The inference appears to be sensitive to assumptions that cannot be adequately assessed with the data on hand.
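The sketch below (a toy illustration with simulated data, not the reconstruction model from our work) shows the general phenomenon: two different growth-temperature relationships can fit a short calibration period almost equally well, yet disagree noticeably once inverted to reconstruct a temperature outside that period:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical calibration data: ring growth observed over a narrow
# range of instrumental temperatures (the "last 100 years").
temp = rng.uniform(14.0, 16.0, size=100)
growth = 0.4 * np.log(temp) + rng.normal(0.0, 0.01, size=100)

# Two competing relationships fitted to the calibration period.
lin = np.polyfit(temp, growth, 1)          # growth = a*temp + b
log = np.polyfit(np.log(temp), growth, 1)  # growth = a*log(temp) + b

# Both fit the calibration data about equally well...
print(np.std(growth - np.polyval(lin, temp)),
      np.std(growth - np.polyval(log, np.log(temp))))

# ...but inverting them for a growth value outside the calibration
# range gives noticeably different reconstructed temperatures.
g_past = 0.95
print((g_past - lin[1]) / lin[0], np.exp((g_past - log[1]) / log[0]))
```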

Another example is the use of repeated counts to estimate the abundance of animal populations. N-mixture modelling is an increasingly popular approach that assumes repeated counts of animals at a particular location are random draws from a binomial distribution in which both the trial size and the success probability are unknown. The trial sizes are assumed to reflect the underlying abundance of the animal at the site and are themselves assumed to be random draws from a Poisson distribution. In place of the binomial assumption, we allowed the data to be realizations from over- and under-dispersed Poisson models. We found that the prediction of site-specific abundance was not robust to these changes: the different models led to major differences in the predicted abundances despite appearing equally plausible.
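For readers unfamiliar with the model, here is a minimal sketch of the standard N-mixture likelihood for a single site (Python; the counts, rate, detection probability and truncation point are illustrative assumptions, not values from our study):

```python
import numpy as np
from scipy import stats

def nmixture_site_likelihood(counts, lam, p, n_max=200):
    """Likelihood of repeated counts at one site under the N-mixture model:
    latent abundance N ~ Poisson(lam); each count ~ Binomial(N, p).
    The infinite sum over N is truncated at n_max."""
    counts = np.asarray(counts)
    n_vals = np.arange(counts.max(), n_max + 1)
    prior = stats.poisson.pmf(n_vals, lam)  # P(N = n)
    obs = np.array([stats.binom.pmf(counts, n, p).prod() for n in n_vals])
    return np.sum(prior * obs)

# Three repeat visits to a hypothetical site.
print(nmixture_site_likelihood([12, 9, 15], lam=30.0, p=0.4))
```

Maximising a product of such terms across sites gives estimates of abundance and detection probability; it is the predicted site-specific abundance from this step that proved sensitive to the distributional assumptions.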

The lack of robustness evident in these examples is not related to outliers. It is a consequence of other modelling assumptions that have been violated. To make the models more robust, we need to be able to identify which sets of assumptions are supported by the data and which are not. One way to do this is to obtain separate, high-quality data that can help us distinguish between the different choices. For the N-mixture model, we could mark or tag some of the animals and use mark-recapture models to provide additional information about the success probability. Unfortunately, there is no obvious solution for the climate reconstruction example. Without additional guidance from the underlying science as to the nature of the relationship between ring growth and temperature, we know of no way to make the model more robust.

The news is not all bad. Along with the parameters that were not robust, each example had other aspects that were relatively robust. In the climate reconstruction example, we found that identifying the time periods when the temperature was warming or cooling was comparatively robust to changes in assumptions, even though the estimated magnitude of those temperature changes varied considerably. Likewise for the N-mixture model, the relative abundance between two sites was comparatively robust (e.g. that site A has estimated abundance twice that of site B), despite the sensitivity in estimates of absolute abundance. We believe these examples are typical of many complex modelling situations: some parameters are sensitive to assumptions while others are relatively robust. For such models to be useful in practice, it is important that practitioners understand which parameters can be reliably inferred and which are more sensitive to the choices made. Assessing model fit and evaluating robustness is more important now than ever in an age of increasing model complexity.

Dr Matthew Schofield is a Senior Lecturer in Statistics at the University of Otago, New Zealand.
