Lay abstract for Statistics in Medicine article: A new mixed‐effects mixture model for constrained longitudinal data


Every few days, we will be publishing layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.

The article featured today is from Statistics in Medicine and the full article, published in issue 39.2, is available to read online here.

Di Brisco, AM, Migliorati, S. A new mixed‐effects mixture model for constrained longitudinal data. Statistics in Medicine. 2020; 39: 129–145. doi: 10.1002/sim.8406

Biomedical research often features continuous outcomes (responses) bounded by the interval [0,1], such as proportions or rates. The well-known beta regression model addresses the constrained nature of these data by assuming that the outcome is beta distributed. Its augmented variant can also accommodate exact zeros and/or ones (e.g. completely unhealthy or completely healthy cases), while its mixed-effects version makes it possible to model longitudinal or clustered responses. However, these models are not robust to outliers or to an excessive number of observations near the tails, and they cannot handle bimodal responses. Bimodality often arises when a qualitative covariate whose values identify clusters or groups (e.g. males vs females) is omitted. In this paper, a new augmented mixed-effects regression model is proposed. It is based on the flexible beta (FB) distribution, a special mixture of two beta distributions that is capable of handling all these issues. In particular, the two mixture components of the FB have naturally ordered means (one lower and one greater) and share a common precision parameter. Extensive simulation studies show that the proposed model outperforms the models most often used in the literature. The proposed model is applied to two real datasets. The first is taken from a long-term study of Parkinson’s disease, in which within-subject correlations must be accounted for. The second investigates the impact of dyslexia and general cognitive ability on reading skills, and has prompted the exploration of an augmented version of the FB distribution. All inferential issues are addressed within a Bayesian approach.
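
For readers who want to see how a mixture of two beta distributions can produce a bimodal outcome on [0,1], the following is a minimal Python sketch. It is an illustration only, not the authors' model or code: it assumes a mean-precision parameterization of each beta component, and the function name, argument names, and numerical values are all hypothetical. It contains no regression terms, random effects, or augmentation for exact zeros and ones.

```python
import numpy as np


def sample_two_beta_mixture(n, p, mean_low, mean_high, precision, seed=0):
    """Draw n values from a two-component beta mixture on (0, 1).

    Illustrative assumption: each component is a beta distribution with
    mean m and precision phi, i.e. Beta(m * phi, (1 - m) * phi). Both
    components share the same precision and their means are ordered.
    """
    if not 0.0 < mean_low < mean_high < 1.0:
        raise ValueError("means must satisfy 0 < mean_low < mean_high < 1")
    if not 0.0 < p < 1.0 or precision <= 0.0:
        raise ValueError("need 0 < p < 1 and precision > 0")
    rng = np.random.default_rng(seed)
    # With probability p an observation comes from the higher-mean component.
    from_high = rng.random(n) < p
    means = np.where(from_high, mean_high, mean_low)
    # Shared precision: only the component mean differs between the groups.
    return rng.beta(means * precision, (1.0 - means) * precision)


# Hypothetical values chosen so that the two groups are well separated.
draws = sample_two_beta_mixture(
    n=20000, p=0.4, mean_low=0.2, mean_high=0.8, precision=30.0
)
print(round(float(draws.mean()), 3))
```

A histogram of `draws` would show two humps, one around each component mean, which is the kind of shape a single beta distribution cannot reproduce.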