# Layman’s abstract for Canadian Journal of Statistics article on estimation of design-based mean squared error of a small area mean model-based estimator under a nested error linear regression model

Each week, we publish layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.

Stefan, M. and Hidiroglou, M.A. (2021), Estimation of design-based mean squared error of a small area mean model-based estimator under a nested error linear regression model. Can J Statistics. https://doi.org/10.1002/cjs.11622

An area or domain is a subset of the population from which a sample has been drawn. Small areas are subpopulations with small sample sizes. The challenge in small area estimation (SAE) is to produce reliable estimates of characteristics of interest, such as means, for areas or domains where the realized sample size is very small or where no sample is available at all. SAE methods can be divided broadly into “design-based” and “model-based” methods. A traditional design-based (direct) estimate of a small area parameter is based solely on the sample data from the corresponding small area. It follows that a lack of sample data from the target area seriously affects the accuracy of the direct estimate, and this has given rise to the development of new estimators that give more precise estimates of these parameters. These estimators are based on statistical models that borrow information across areas to increase efficiency, and they are referred to as model-based (indirect) estimators of the small area parameters of interest.

The precision of these indirect estimators is measured using the models on which they are based. The resulting model mean squared error (mMSE) is traditionally used to measure the variability of the indirect estimators. On the other hand, National Statistical Offices (NSOs) are often interested in estimating the design mean squared error (dMSE), which measures the variability of an estimator over all possible samples that could be selected from the target population under the sampling design in use. The dMSE is more appealing to a survey practitioner than the mMSE, as it does not rely on a model. Design-based estimators of the dMSE can be obtained, but they tend to be unstable when the area sample size is small.
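To make the idea of "variability over all possible samples" concrete, here is a minimal Python sketch. It is not taken from the paper: the population values, the function names and the choice of simple random sampling without replacement are all assumptions made for illustration. It holds one small area's values fixed, draws many samples, and averages the squared error of the direct estimator (the sample mean). For this simple design the result can be checked against the textbook formula (1 - n/N) S²/n.

```python
import random
import statistics


def design_mse_of_sample_mean(area_values, n, n_draws=20000, seed=0):
    """Monte Carlo approximation of the design MSE of the direct estimator
    (the sample mean) under simple random sampling without replacement.
    The population values stay fixed; only the sample changes."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(area_values)
    squared_errors = []
    for _ in range(n_draws):
        sample = rng.sample(area_values, n)  # one possible sample
        estimate = statistics.fmean(sample)  # direct estimate from that sample
        squared_errors.append((estimate - true_mean) ** 2)
    return statistics.fmean(squared_errors)


def exact_design_variance(area_values, n):
    """Textbook design variance of the sample mean under simple random
    sampling without replacement: (1 - n/N) * S^2 / n. The sample mean is
    design-unbiased here, so this equals its design MSE."""
    N = len(area_values)
    s2 = statistics.variance(area_values)  # divisor N - 1
    return (1 - n / N) * s2 / n


# Hypothetical small area with 200 units (assumed values, for illustration only).
population_rng = random.Random(42)
area_values = [population_rng.gauss(50, 10) for _ in range(200)]

for n in (3, 10, 50):
    approx = design_mse_of_sample_mean(area_values, n)
    exact = exact_design_variance(area_values, n)
    print(f"n={n:3d}  Monte Carlo dMSE={approx:8.3f}  formula={exact:8.3f}")
```

The dMSE of the direct estimator grows quickly as the area sample size shrinks, which is the motivation for turning to model-based estimators in small areas.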

In our paper, we consider the basic unit level model and obtain a formula that approximates the dMSE of a model-based estimator of a small area mean. Using this new approximation to the dMSE, we propose a conditional model estimator of the dMSE (cmmse). The notable feature of the cmmse estimator is that it is self-contained: it is not a mix of an estimated model mean squared error and a design mean squared error. Furthermore, it yields positive estimates. We report the results of simulation studies designed to illustrate the performance of the cmmse estimator and to compare it with other estimators of the dMSE proposed in the literature. All the dMSE estimators were evaluated in terms of design relative bias, absolute relative bias, relative root mean squared error, and the coverage rate and interval score of confidence intervals. Our cmmse estimator is shown to perform well under correct as well as incorrect modeling. For moderate small area sample sizes (5 units or more), the coverage rate associated with the cmmse tends to the nominal level and its interval score values are close to the minimum values associated with the true dMSE.
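For readers unfamiliar with the evaluation criteria named above, the following Python sketch shows the standard textbook forms of three of them: relative bias, coverage rate, and the interval score of a (1 - alpha) confidence interval (interval width plus a penalty of 2/alpha times the distance by which the interval misses the true value). The function names and the toy numbers are hypothetical, and the exact definitions used in the paper's simulations may differ in detail.

```python
import statistics


def relative_bias(estimates, target):
    """Average of the estimates minus the target, relative to the target."""
    return (statistics.fmean(estimates) - target) / target


def coverage_rate(intervals, truth):
    """Share of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= truth <= upper)
    return hits / len(intervals)


def interval_score(lower, upper, truth, alpha=0.05):
    """Interval score for a (1 - alpha) interval: narrow intervals score low,
    but an interval that misses the truth is penalized in proportion to the
    size of the miss. Smaller is better."""
    width = upper - lower
    miss_below = max(lower - truth, 0.0)  # truth lies below the interval
    miss_above = max(truth - upper, 0.0)  # truth lies above the interval
    return width + (2.0 / alpha) * (miss_below + miss_above)


# Toy example (assumed numbers): three intervals for a true area mean of 1.0.
intervals = [(0.0, 2.0), (1.5, 3.0), (0.0, 0.5)]
print(coverage_rate(intervals, truth=1.0))
print([interval_score(lo, hi, truth=1.0) for lo, hi in intervals])
```

An interval that is too narrow misses the truth often and pays the penalty, while one that is too wide pays through its width, so the score rewards MSE estimates that are neither too small nor too large.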