Every few days, we will be publishing layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.
The article featured today is from the Canadian Journal of Statistics, with the full article now available to read here.
Guan, L. and Tibshirani, R. (2020), Post model‐fitting exploration via a “Next‐Door” analysis. Can J Statistics, 48: 447-470. doi:10.1002/cjs.11542
Next-Door analysis is a simple method for evaluating the model chosen by a data-adaptive procedure, referred to as the base model. The analysis deletes each selected predictor in the base model in turn and reruns the procedure, producing a set of models that are “close” to the base model, and then compares the error rate of the base model with those of the nearby models. If deleting a predictor leads to a significant deterioration in predictive performance, that predictor is called indispensable; otherwise, the corresponding nearby model is called acceptable and can serve as a good alternative to the base model.

Post model-fitting exploration is important in modern data analysis, where a good model must often be chosen from the available data. When the number of candidate features is moderately large, this typically involves model and feature selection aimed at both predictive performance and interpretability. Next-Door analysis offers a new perspective and a flexible framework for such exploration: it provides both an assessment of the predictive contribution of each variable and a set of alternative models that may be used in place of the base model. We have implemented it in the R language as a library to accompany the well-known glmnet library.
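To make the drop-and-refit idea concrete, the following minimal R sketch fits a lasso base model with cv.glmnet on simulated data, removes each selected predictor in turn, reruns the same procedure, and compares cross-validated errors. This is only an illustration of the general idea under those assumptions; it is not the interface of the accompanying library, and the raw error comparison below stands in for the paper's more careful assessment of the error differences.

library(glmnet)

set.seed(1)
n <- 100; p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] + 0.5 * x[, 2] + rnorm(n)

# Base model: lasso with lambda chosen by cross-validation
base_cv  <- cv.glmnet(x, y)
base_err <- min(base_cv$cvm)
selected <- which(as.vector(coef(base_cv, s = "lambda.min"))[-1] != 0)

# Nearby models: delete each selected predictor in turn and rerun the procedure
nearby_err <- sapply(selected, function(j) {
  cv_j <- cv.glmnet(x[, -j, drop = FALSE], y)
  min(cv_j$cvm)
})

# Predictors whose removal clearly inflates the error look "indispensable";
# the rest yield "acceptable" nearby models that could replace the base model.
data.frame(predictor    = selected,
           base_error   = base_err,
           nearby_error = nearby_err)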