Open Access: Practical strategies for generalized extreme value-based regression models for extremes

Every week, we select a recently published Open Access article to feature. This week’s article is from Environmetrics and proposes new generalized extreme value distributions for application to pollution levels in California.

The article’s abstract is given below, with the full article available to read here. 

Castro-Camilo, D., Huser, R., & Rue, H. (2022). Practical strategies for generalized extreme value-based regression models for extremes. Environmetrics, e2742.

The generalized extreme value (GEV) distribution is the only possible limiting distribution of properly normalized maxima of a sequence of independent and identically distributed random variables. As such, it has been widely applied to approximate the distribution of maxima over blocks. In these applications, GEV properties such as finite lower endpoint when the shape parameter $$ \xi $$ is positive or the loss of moments due to the magnitude of $$ \xi $$ are inherited by the finite-sample maxima distribution. The extent to which these properties are realistic for the data at hand has been widely ignored. Motivated by these overlooked consequences in a regression setting, we here make three contributions. First, we propose a blended GEV (bGEV) distribution, which smoothly combines the left tail of a Gumbel distribution (GEV with $$ \xi =0 $$) with the right tail of a Fréchet distribution (GEV with $$ \xi >0 $$). Our resulting distribution has, therefore, unbounded support. Second, we propose a principled method called property-preserving penalized complexity (P$$ {}^3 $$C) prior to decide on the existence of the GEV distribution first and second moments a priori. Third, we propose a reparametrization of the GEV distribution that provides a more natural interpretation of the (possibly covariate-dependent) model parameters, which in turn helps define meaningful priors. We implement the bGEV distribution with the new parameterization and the P$$ {}^3 $$C prior approach in the R-INLA package to make it readily available to users. We illustrate our methods with a simulation study that reveals that the GEV and bGEV distributions are comparable when estimating the right tail under large-sample settings. Moreover, some small-sample settings show that the bGEV fit slightly outperforms the GEV fit. Finally, we conclude with an application to NO$$ {}_2 $$ pollution levels in California that illustrates the suitability of the new parameterization and the P$$ {}^3 $$C prior approach in the Bayesian framework.
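
For readers who want a feel for the blending idea, here is a minimal Python sketch of one way a GEV left tail could be swapped for a Gumbel one. It is an illustration only, not the authors' R-INLA implementation. The abstract does not spell out the construction, so the following are assumptions based on our reading of the full article and should be checked against it: the blended CDF takes the form $$ H(x)=F(x)^{p(x)}G(x)^{1-p(x)} $$, where $$ F $$ is the GEV CDF and $$ G $$ a Gumbel CDF; the weight $$ p $$ is a Beta(5, 5) CDF on the interval between the 0.05 and 0.2 quantiles of $$ F $$; and the Gumbel parameters are chosen so that both distributions share those two quantiles. All function names are hypothetical.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV CDF for a positive shape parameter xi (Frechet case)."""
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        # At or below the finite lower endpoint mu - sigma / xi.
        return 0.0
    log_s = -math.log(t) / xi  # log of t ** (-1 / xi)
    if log_s > 700.0:
        return 0.0  # exp(-exp(log_s)) underflows to zero here
    return math.exp(-math.exp(log_s))


def gev_quantile(p, mu, sigma, xi):
    """Quantile function of the GEV for xi > 0 and 0 < p < 1."""
    return mu + sigma * ((-math.log(p)) ** (-xi) - 1.0) / xi


def beta55_cdf(u):
    """Beta(5, 5) CDF, using the binomial-sum identity for integer shapes."""
    u = min(max(u, 0.0), 1.0)
    return sum(math.comb(9, j) * u**j * (1.0 - u) ** (9 - j) for j in range(5, 10))


def bgev_cdf(x, mu, sigma, xi, p_a=0.05, p_b=0.2):
    """Blended CDF: Gumbel below a, GEV above b, smooth mix in between.

    The mixing levels p_a, p_b and the Beta(5, 5) weight are assumptions.
    """
    a = gev_quantile(p_a, mu, sigma, xi)
    b = gev_quantile(p_b, mu, sigma, xi)
    # Gumbel location and scale that match the GEV quantiles at p_a and p_b.
    la, lb = math.log(-math.log(p_a)), math.log(-math.log(p_b))
    sigma_g = (b - a) / (la - lb)
    mu_g = a + sigma_g * la
    w = beta55_cdf((x - a) / (b - a))  # 0 for x <= a, 1 for x >= b
    if w >= 1.0:
        return gev_cdf(x, mu, sigma, xi)
    z = -(x - mu_g) / sigma_g
    gumbel = 0.0 if z > 700.0 else math.exp(-math.exp(z))
    if w <= 0.0:
        return gumbel
    return gev_cdf(x, mu, sigma, xi) ** w * gumbel ** (1.0 - w)
```

With location 0, scale 1 and $$ \xi =0.5 $$, the GEV has lower endpoint $$ -2 $$, so `gev_cdf(-2.5, 0.0, 1.0, 0.5)` is exactly zero, while `bgev_cdf(-2.5, 0.0, 1.0, 0.5)` is tiny but strictly positive. Above the 0.2 quantile the two functions coincide, which is consistent with the abstract's point that the right tail is left untouched.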
