Each week, we publish lay abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.
The article featured today is from the Canadian Journal of Statistics with the full article now available to read here.
Tsao, M. (2023), Regression model selection via log-likelihood ratio and constrained minimum criterion. Can J Statistics. https://doi.org/10.1002/cjs.11756
This paper develops a new method of variable selection for linear, generalized linear, and other types of regression models. The method is based on the following geometric argument. In the unknown true parameter vector of the full model containing all variables, elements corresponding to active variables are non-zero, and elements corresponding to inactive variables are all zero. To find the inactive variables is to identify those elements of the parameter vector that are zero.
The likelihood ratio confidence region for the parameter vector of the full model is centred at the maximum likelihood estimator. As the sample size goes to infinity, the confidence region shrinks and degenerates into the maximum likelihood estimator, which converges to the true parameter vector. It follows that the whole confidence region converges to the true parameter vector, so for sufficiently large sample sizes, the elements corresponding to active variables are non-zero in every vector in the confidence region. Vectors in the region can therefore have zeros only in positions corresponding to inactive variables, and the true parameter vector has zeros in all of those positions. This implies that when the confidence region captures the true parameter vector, the true parameter vector is a vector in the region with the most zeros among its elements. Because of this, the method selects a vector with the most zeros in the region as an estimate of the unknown true parameter vector, thereby classifying the variables corresponding to the zeros of the chosen vector as inactive.
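The selection rule described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a Gaussian linear model without an intercept, searches all subsets exhaustively, and treats a submodel as lying inside the confidence region when its log-likelihood ratio statistic against the full model, n·log(RSS_sub / RSS_full), is at most a chi-square quantile. The function names (`solve`, `rss`, `select_sparsest`), the simulated data, the 95% level, and the tie-breaking rule are all assumptions made for illustration.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit using only the columns in cols."""
    if not cols:
        return sum(v * v for v in y)
    xtx = [[sum(row[i] * row[j] for row in X) for j in cols] for i in cols]
    xty = [sum(row[i] * v for row, v in zip(X, y)) for i in cols]
    beta = solve(xtx, xty)
    return sum(
        (v - sum(b * row[i] for b, i in zip(beta, cols))) ** 2
        for row, v in zip(X, y)
    )


def select_sparsest(X, y, threshold):
    """Return the smallest set of columns whose restricted fit stays inside the region.

    A submodel counts as inside the region when its log-likelihood ratio
    statistic against the full model is at most `threshold` (an assumed cutoff).
    Ties between equally sparse submodels go to the smallest statistic.
    """
    n, p = len(X), len(X[0])
    rss_full = rss(X, y, list(range(p)))
    for size in range(p + 1):  # fewest non-zero coefficients first
        inside = []
        for cols in itertools.combinations(range(p), size):
            lr = n * math.log(rss(X, y, list(cols)) / rss_full)
            if lr <= threshold:
                inside.append((lr, cols))
        if inside:
            return min(inside)[1]


# Simulated data (hypothetical): columns 0 and 2 are active, 1 and 3 are inactive.
random.seed(0)
true_beta = [2.0, 0.0, -1.5, 0.0]
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0, 1) for row in X]

# 9.488 is the 0.95 quantile of the chi-square distribution with 4 degrees of freedom.
selected = select_sparsest(X, y, threshold=9.488)
print("columns kept as active:", selected)
```

Because the restricted least-squares fit is the most likely vector with a given pattern of zeros, checking that single fit is enough to decide whether any vector with that pattern lies in the region. Exhaustive search is only practical for a small number of variables.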
This method provides a frequentist perspective on the variable/model selection problem. It is an alternative and a strong competitor to the commonly used Akaike Information Criterion and Bayesian Information Criterion for selecting regression models.