Layman’s abstract for paper on doubly sparse regression incorporating graphical structure among predictors

Every few days, we will be publishing layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.

The article featured today is from the Canadian Journal of Statistics and the full article, published in issue 47.4, is available to read online here.

Stephenson, M., Ali, R. A., Darlington, G. A. and (2019), Doubly sparse regression incorporating graphical structure among predictors. Can J Statistics, 47: 729-747. doi: 10.1002/cjs.11521

When biological data come from a complex system, it is advantageous to exploit the structure of that system to predict the biological response. We present a novel approach, doubly sparse regression incorporating graphical structure (DSRIG), that first models the underlying structure of the variables in the system and then leverages this structural information to improve prediction of the response. The model is highly flexible and has excellent predictive ability compared with other methods, particularly when only a fraction of the variables are related to the response. It can help identify relevant predictors, as well as potential predictors that may warrant further scrutiny for understanding the response. DSRIG is valid in (i) high-dimensional settings, where the number of variables exceeds the sample size, and (ii) settings in which some predictors are highly correlated with one another. Further, DSRIG is proven to have desirable theoretical properties. The new model is demonstrated on two data sets, one of which models cognitive function using the volumes of brain regions. We identify a list of candidate regions that may be associated with decreased cognitive functioning, as seen in the progression of Alzheimer’s disease. MATLAB code to fit DSRIG to data is available for public use.