Layman’s abstract for paper on accounting for established predictors with the multistep elastic net

Every few days, we will be publishing layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.

The article featured today is from Statistics in Medicine and the full article, published in issue 38.23, is available to read online here.

Chase, EC, Boonstra, PS. Accounting for established predictors with the multistep elastic net. Statistics in Medicine. 2019; 38: 4534–4544. doi: 10.1002/sim.8313

When a dataset has numerous predictors and a comparatively small sample size, it is important to use all available information in and about the data. However, many researchers overlook a common source of information: prior knowledge about some of the predictors. Although many predictors may be novel and unexplored, studies rarely start from a completely blank slate. Based on previous research, investigators usually have some idea of which covariates (race, sex, age, smoking history, etc.) are likely to be associated with the outcome of interest. Models that incorporate the knowledge that these covariates have already been tested may offer additional efficiency in estimation and prediction.

To that end, Chase and Boonstra present the multistep elastic net (MSN), a penalized regression approach that classifies predictors into two broad classes: established (previously tested and found significant) and unestablished (untested and novel). The method then uses cross-validation to select an optimal relative penalty for the two classes of predictors. In simulation studies, the MSN achieved comparable or superior predictive and estimation performance relative to other common approaches for this problem. For the applied statistician, the MSN offers a simple heuristic for incorporating prior knowledge into analyses without requiring strong assumptions or detailed prior information about the covariates.
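The relative-penalty idea can be sketched with a toy example. The snippet below is not the authors' MSN implementation: it substitutes scikit-learn's `LassoCV` (lasso rather than elastic net, an assumption made for simplicity) and uses the standard column-rescaling trick, under which dividing a column of the design matrix by a weight `w` multiplies that column's lasso penalty by `w`. Cross-validation then picks the relative penalty weight for the established block, mirroring the MSN's two-class structure. The grid of weights and the simulated data are illustrative choices, not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Simulated data: 3 "established" predictors with real effects plus 20 novel ones.
rng = np.random.default_rng(0)
n, p_est, p_new = 100, 3, 20
X = rng.normal(size=(n, p_est + p_new))
beta = np.concatenate([[2.0, -1.5, 1.0], np.zeros(p_new)])
y = X @ beta + rng.normal(size=n)

# Hypothetical grid of relative penalty weights for the established block
# (w < 1 penalizes established predictors less; w = 1 treats all predictors alike).
best_w, best_cv_mse, best_model = None, np.inf, None
for w in (0.1, 0.25, 0.5, 1.0):
    Xw = X.copy()
    Xw[:, :p_est] /= w  # rescaling a column by 1/w scales its lasso penalty by w
    model = LassoCV(cv=5).fit(Xw, y)
    cv_mse = model.mse_path_.mean(axis=1).min()  # best mean CV error over lambdas
    if cv_mse < best_cv_mse:
        best_w, best_cv_mse, best_model = w, cv_mse, model

# Map coefficients back to the original scale of X.
coefs = best_model.coef_.copy()
coefs[:p_est] /= best_w
print("selected relative penalty:", best_w)
print("established-predictor estimates:", np.round(coefs[:p_est], 2))
```

The two-level cross-validation here (over the relative weight `w` in the outer loop, over the overall penalty inside `LassoCV`) is the key structural point: the established/unestablished split adds only one extra tuning dimension, which is what keeps the approach simple for the applied analyst.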