Each week, we will be publishing layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.
The article featured today is from the Canadian Journal of Statistics, with the full article now available to read on Early View here.
So, H.Y., Thompson, M.E. and Wu, C. (2020), Correlated and misclassified binary observations in complex surveys. Can J Statistics. doi:10.1002/cjs.11551
In medical and health surveys, binary responses are very often subject to misclassification. For example, respondents might report having coronary disease at the beginning of a study but deny having any heart problem in follow-up questionnaires. The discrepancy may be due to misdiagnosis, faulty memory or human error in the recording process.
In longitudinal surveys, the primary interest lies in estimating how the population changes over time at the individual level. When responses come from the same individuals, the correlations between observations can be substantial. In addition, answers may be correlated within communities, such as the caseload of a family doctor or the patients of a particular health centre. Diagnoses and treatments may also vary across districts, resulting in different levels of response error.
Ignoring misclassification and correlation can make estimates of population prevalence inaccurate. Health policies based on those estimates may then be inadequate for the needs of the population.
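To make the effect of misclassification on prevalence concrete, here is a minimal Python sketch. It is not the method proposed in the article. It uses the standard textbook relationship between true and observed prevalence for a binary response with known sensitivity and specificity, together with the classical Rogan-Gladen correction. The function names and the numbers (10% true prevalence, 85% sensitivity, 95% specificity) are hypothetical and chosen only for illustration.

```python
def observed_prevalence(true_prev, sensitivity, specificity):
    """Expected proportion of 'yes' answers when responses are misclassified.

    True cases answer 'yes' with probability `sensitivity`;
    non-cases answer 'yes' with probability 1 - `specificity`.
    """
    return sensitivity * true_prev + (1.0 - specificity) * (1.0 - true_prev)


def corrected_prevalence(observed_prev, sensitivity, specificity):
    """Rogan-Gladen correction: invert the relationship above.

    Assumes sensitivity and specificity are known, which is a
    simplification made here for illustration only.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed_prev - (1.0 - specificity)) / denom


# Hypothetical illustration values, not taken from the article or the CLSA.
true_prev = 0.10
sensitivity = 0.85
specificity = 0.95

naive = observed_prevalence(true_prev, sensitivity, specificity)
recovered = corrected_prevalence(naive, sensitivity, specificity)
print(f"true prevalence:      {true_prev:.3f}")
print(f"naive (observed):     {naive:.3f}")
print(f"corrected prevalence: {recovered:.3f}")
```

With these illustrative values the naive estimate is about 13% rather than the true 10%, so even modest error rates can noticeably distort a prevalence estimate. This simple correction treats the error rates as known and ignores correlation between observations, which are the complications the article addresses.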
This article proposes a pseudo-generalized estimating equation (pseudo-GEE) approach to analyzing binary survey responses that are subject to misclassification. The method is particularly useful for cluster sampling designs and longitudinal surveys, and it can estimate the misclassification and response parameters at the same time. The method is assessed numerically using a real dataset from the Canadian Longitudinal Study on Aging (CLSA).