Layman’s abstract: Nested case-control study designs for left-truncated survival data

Every Friday on Statistics Views, we publish layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format. The article featured today is from The Canadian Journal of Statistics: Nested case-control study designs for left-truncated survival data by Ana F. Best and David B. Wolfson.

Read the layman’s abstract below.

Ana F. Best, David B. Wolfson. (2017), Nested case-control study designs for left-truncated survival data, The Canadian Journal of Statistics, 45, pages 4-28, doi: 10.1002/cjs.11311

The study of the natural history of diseases allows us to, among other things, better assess disease screening programs, plan drug trials, and anticipate disease burden from a public health perspective. Of frequent importance in the natural history is the time interval between disease onset and a subsequent well-defined state, for example, death from any cause. Often, this so-called survival interval is suspected of being associated with various risk factors, and studies are conducted to identify which of these are truly associated with survival. Increasingly, the aim is to associate genetic profiles and biochemical markers with survival. While the collection of biological specimens may be relatively easy and inexpensive (for example, it is easy to obtain a mucous swab), the laboratory analyses of these samples could require special expertise and be prohibitively expensive.

The prevalent cohort study with follow-up (PCSWF) is sometimes used to find risk factors for survival for diseases that are relatively rare and/or have long duration. Study participants with existing (prevalent) disease in a large sample are identified over a short time period and followed until death or the end of the study. The disease durations of these participants are known to be left-truncated. They are “on average” longer than the durations that would be observed by following an initially disease-free cohort through the sequence of states (disease-free, disease, and death), because participants with prevalent disease, by definition, must survive long enough to be included in a PCSWF. This bias must be accommodated in any statistical analysis.
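The effect of left truncation on observed durations can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings and is not taken from the article: disease onsets are spread uniformly over a hypothetical window before recruitment, durations are exponential with an assumed mean, and a person enters the prevalent cohort only if still alive at recruitment. The function name and all parameter values are hypothetical.

```python
import random


def simulate_left_truncation(n=20000, mean_duration=5.0, window=30.0, seed=0):
    """Compare mean disease duration in everyone vs. in a prevalent cohort.

    Assumptions (illustrative only): onset is uniform on [0, window),
    recruitment happens at time `window`, durations are exponential.
    """
    rng = random.Random(seed)
    all_durations = []
    prevalent_durations = []
    for _ in range(n):
        onset = rng.uniform(0.0, window)
        duration = rng.expovariate(1.0 / mean_duration)
        all_durations.append(duration)
        # Only those still alive at recruitment can be sampled as prevalent cases.
        if onset + duration > window:
            prevalent_durations.append(duration)
    mean_all = sum(all_durations) / len(all_durations)
    mean_prevalent = sum(prevalent_durations) / len(prevalent_durations)
    return mean_all, mean_prevalent


mean_all, mean_prevalent = simulate_left_truncation()
print(f"mean duration, everyone with disease: {mean_all:.2f}")
print(f"mean duration, prevalent cohort:      {mean_prevalent:.2f}")
```

Under these assumptions, the prevalent cohort shows a noticeably longer average duration than the full diseased population, which is the bias that an analysis of PCSWF data has to correct for.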

In this research, it is shown that in a PCSWF the loss in statistical efficiency may not be great if full risk factor information (including information on genetic and biochemical markers) is obtained on only a small proportion of those being followed. More precisely, this information need only be ascertained on deceased participants and a set of non-deceased participants obtained by randomly sampling roughly four people from the surviving cohort at each instant a death is observed. Other design issues are also discussed. Sampling from the risk sets at each failure time has been used widely for the investigation of risk factors for disease onset, where the states are disease-free and disease. Whether in this context or in the context of a prevalent cohort study with follow-up, such studies are called nested case-control studies, because risk set sampling can be interpreted as the sampling of controls each time a case is observed.
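The risk set sampling described above can be sketched in a few lines of code. This is a minimal illustration, not the authors' design or estimator: it assumes time is measured from disease onset, that a participant is at risk at time t only if recruited before t (left truncation) and still under observation at t, and that up to four controls are drawn without replacement at each death. The `Participant` type, the function name, and the toy cohort are all hypothetical.

```python
import random
from typing import NamedTuple


class Participant(NamedTuple):
    pid: int
    entry_time: float  # disease duration at recruitment (left-truncation time)
    exit_time: float   # disease duration at death or at end of follow-up
    died: bool


def sample_nested_case_control(cohort, controls_per_case=4, seed=0):
    """Return a list of (case, controls) pairs, one per observed death."""
    rng = random.Random(seed)
    cases = sorted((p for p in cohort if p.died), key=lambda p: p.exit_time)
    sampled_sets = []
    for case in cases:
        t = case.exit_time
        # Risk set at t: already recruited, still being followed, not the case.
        risk_set = [
            p for p in cohort
            if p.pid != case.pid and p.entry_time < t <= p.exit_time
        ]
        k = min(controls_per_case, len(risk_set))
        sampled_sets.append((case, rng.sample(risk_set, k)))
    return sampled_sets


# Hypothetical toy cohort: (id, entry, exit, died)
cohort = [
    Participant(1, 0.5, 3.0, True),
    Participant(2, 1.0, 6.0, False),
    Participant(3, 2.0, 4.5, True),
    Participant(4, 0.2, 8.0, False),
    Participant(5, 3.5, 7.0, True),
    Participant(6, 1.5, 9.0, False),
    Participant(7, 0.8, 5.0, False),
]

sampled_sets = sample_nested_case_control(cohort)
for case, controls in sampled_sets:
    print(case.pid, sorted(c.pid for c in controls))
```

Only the cases and the sampled controls would then need the expensive laboratory analyses. Note that a participant sampled as a control at one death time can still become a case later, which is a standard feature of nested case-control sampling.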