Each week, we publish lay abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.
The article featured today is from Statistics in Medicine with the full article now available to read here.
Federated causal inference in heterogeneous observational data. Statistics in Medicine. 2023; 1–22. doi: 10.1002/sim.9868
Suppose that in a health crisis or pandemic, a researcher wants to study the effect of existing drugs on preventing a new disease. Evidence that a drug appears to have protective effects might be considered when prioritizing promising candidates for further study or randomized trials.
The researcher has access to confidential patient records distributed across multiple hospitals, where patient-level data cannot be easily shared across hospitals. One approach to studying the effect is to analyze whether, among patients admitted to the hospital with the new disease, those already taking these drugs have better outcomes than those who are not.
Two challenges arise. The first is that, in observational data, patients taking a drug differ from those who do not. Thus, it is important to use statistical methods that compare patients who face similar health risks from the new disease. This can be achieved by adjusting for patient characteristics in the statistical methods. However, adjusting for patient characteristics is complicated by a second challenge: no hospital has enough data on its own to learn how patient characteristics affect outcomes, and the data cannot be pooled across hospitals.
To address these two challenges, the authors propose federated methods to efficiently estimate treatment effects from multiple confidential databases. These methods request only summary statistics from each database in a single shot, without the need to iteratively retrieve information from individual databases. The treatment effect estimates from the federated methods are doubly robust, with variance equivalent to what would be achieved if the data sets were pooled. The federated methods also estimate this variance, which allows for inference on treatment effects.
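To make the "summary statistics in a single shot" idea concrete, here is a minimal sketch in Python. It is not the authors' doubly robust estimator: it assumes a much simpler stand-in, where each hospital reports only a difference in mean outcomes and its variance, and a coordinator combines them by inverse-variance weighting. The function names and the outcome numbers are hypothetical, and no adjustment for patient characteristics is made.

```python
from statistics import mean, variance

def site_summary(treated, control):
    """Run locally at one hospital; only two aggregate numbers leave the site."""
    estimate = mean(treated) - mean(control)
    # Variance of a difference in means of two independent samples.
    var = variance(treated) / len(treated) + variance(control) / len(control)
    return estimate, var

def combine(summaries):
    """Run at the coordinator: inverse-variance weighted average of site estimates."""
    weights = [1.0 / var for _, var in summaries]
    total = sum(weights)
    estimate = sum(w * est for w, (est, _) in zip(weights, summaries)) / total
    # The combined estimate has variance equal to 1 / (sum of weights).
    return estimate, 1.0 / total

# Hypothetical outcome scores; patient-level values never leave site_summary.
hospital_a = site_summary(treated=[2.0, 3.0, 4.0], control=[1.0, 2.0, 3.0])
hospital_b = site_summary(treated=[5.0, 7.0, 9.0], control=[2.0, 4.0, 6.0])
effect, effect_var = combine([hospital_a, hospital_b])
```

Each hospital is contacted once, and the coordinator receives two numbers per site, which is the sense in which such a procedure is one-shot. The published methods go further by adjusting for patient characteristics and by adapting the aggregation to how stable the models are across sites.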
The authors further show that, to ensure valid inference, the estimation procedures in the federated methods must adapt to conditions that describe how stable the models are across databases. Finally, the authors present a case study evaluating the protective effect of alpha-blockers, a class of drugs commonly prescribed to treat prostate conditions, on outcomes for patients admitted to the hospital with respiratory illness.