Semiparametric maximum likelihood estimation under not missing at random


  • Authors: Kosuke Morikawa, Jae Kwang Kim and Yutaka Kano
  • Date: 07 December 2018

Nonresponse is frequently encountered in empirical studies. When the response mechanism is missing not at random (MNAR), statistical inference using the observed data is quite challenging. Handling MNAR data often requires two model assumptions: one for the outcome and the other for the response propensity. Specifying both models correctly is hard, and the assumptions are difficult to verify from the responses obtained. In a recent article published in The Canadian Journal of Statistics, the authors propose a maximum likelihood method for MNAR data that is semiparametric in the sense that a parametric assumption is used for the response propensity part of the model and a nonparametric model is used for the outcome part.

The paper is cited below, followed by the authors' explanation of their findings:

Semiparametric maximum likelihood estimation with data missing not at random

Kosuke Morikawa, Jae Kwang Kim and Yutaka Kano

The Canadian Journal of Statistics, Volume 45, Issue 4, December 2017, pages 393-409


Missing data have become a major problem in statistical inference. Naive analyses, such as listwise deletion of incomplete cases, may lead to distorted results. When the response mechanism can be explained by observed data only, it is called missing at random (MAR); otherwise, it is referred to as missing not at random (MNAR). For missing data with a MAR mechanism, appropriate statistical methods are well established. However, it is challenging to analyze MNAR data because the response model depends on variables that are not observed. Conducting maximum likelihood estimation under an MNAR mechanism often requires strong model assumptions.
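The contrast between MAR and MNAR can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating model, the logistic response probabilities, and the function names (`expit`, `regression_imputation_mean`, `simulate`) are all assumptions chosen for illustration. It compares the complete-case mean with a regression-imputation estimate, a standard correction that is valid under MAR.

```python
import numpy as np


def expit(t):
    """Logistic function, used here as an assumed response probability."""
    return 1.0 / (1.0 + np.exp(-t))


def regression_imputation_mean(x, y, responded):
    """Fit y ~ x on respondents only, then average the fitted values over all x.

    This is consistent for E[y] when response depends on x alone (MAR),
    because E[y | x, responded] then equals E[y | x].
    """
    slope, intercept = np.polyfit(x[responded], y[responded], deg=1)
    return float(np.mean(intercept + slope * x))


def simulate(n=200_000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                       # always observed covariate
    y = 1.0 + 0.5 * x + rng.normal(size=n)       # outcome, true mean 1.0

    # Hypothetical MAR mechanism: response depends on the observed x only.
    r_mar = rng.random(n) < expit(0.5 + x)
    # Hypothetical MNAR mechanism: response depends on y itself.
    r_mnar = rng.random(n) < expit(0.5 + y)

    return {
        "true_mean": 1.0,
        "mar_complete_case": float(y[r_mar].mean()),
        "mar_imputed": regression_imputation_mean(x, y, r_mar),
        "mnar_complete_case": float(y[r_mnar].mean()),
        "mnar_imputed": regression_imputation_mean(x, y, r_mnar),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>20s}: {value:.3f}")
```

In this setup the complete-case mean is biased upward under both mechanisms. The regression correction removes the bias under MAR but not under MNAR, because the respondents' conditional distribution of y given x is itself distorted when response depends on y.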

The authors propose a semiparametric maximum likelihood estimator for the parameters of the response model. Their method does not require parametric model assumptions on the outcome variable. Furthermore, because it is based on the method of maximum likelihood, the proposed estimator is asymptotically more efficient than previously proposed semiparametric estimators. The paper presents the statistical properties of the proposed estimator, results of a simulation study, and an application to the Korea Labor and Income Panel Survey data.
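For readers who prefer notation, a setup of this kind is commonly written as follows. The symbols here (response indicator \delta, covariate x, outcome y, parameter \phi) and the logistic form of the response model are assumptions for illustration, not the paper's exact formulation.

```latex
% Response propensity: probability that y is observed.
\pi(x, y; \phi) = \Pr(\delta = 1 \mid x, y)

% MAR: the propensity does not depend on the possibly missing y.
\pi(x, y; \phi) = \pi(x; \phi)

% MNAR: the propensity depends on y, e.g. an assumed logistic model
\pi(x, y; \phi) =
  \frac{\exp(\phi_0 + \phi_1 x + \phi_2 y)}
       {1 + \exp(\phi_0 + \phi_1 x + \phi_2 y)},
  \qquad \phi_2 \neq 0 .

% Observed-data likelihood with outcome density f(y \mid x):
L(\phi, f) =
  \prod_{i:\,\delta_i = 1} \pi(x_i, y_i; \phi)\, f(y_i \mid x_i)
  \prod_{i:\,\delta_i = 0} \int \{1 - \pi(x_i, y; \phi)\}\, f(y \mid x_i)\, dy
```

A fully parametric analysis specifies both \pi and f. In a semiparametric approach of the kind described above, only \pi is given a parametric form and f is left unspecified.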


