Canadian Journal of Statistics

Q‐learning for estimating optimal dynamic treatment rules from observational data

Journal Article

Abstract

The area of dynamic treatment regimes (DTR) aims to make inference about adaptive, multistage decision‐making in clinical practice. A DTR is a set of decision rules, one per interval of treatment, where each decision is a function of treatment and covariate history that returns a recommended treatment. Q‐learning is a popular method from the reinforcement learning literature that has recently been applied to estimate DTRs. While, in principle, Q‐learning can be used for both randomized and observational data, the focus in the literature thus far has been exclusively on the randomized treatment setting. We extend the method to incorporate measured confounding covariates, using direct adjustment and a variety of propensity score approaches. The methods are examined under various settings including non‐regular scenarios. We illustrate the methods in examining the effect of breastfeeding on vocabulary testing, based on data from the Promotion of Breastfeeding Intervention Trial. The Canadian Journal of Statistics 40: 629–645; 2012 © 2012 Statistical Society of Canada
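The abstract describes the pipeline only at a high level, so a minimal sketch of backward-induction Q-learning with inverse-probability-of-treatment weights may help fix ideas. Everything below (the two-stage simulated data-generating model, its coefficient values, and the helper names `fit_propensity` and `wls`) is a hypothetical illustration, not the paper's actual method or the PROBIT data; the paper also studies direct adjustment and other propensity-score approaches not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# ---- Hypothetical two-stage observational data (illustration only) ----
X1 = rng.normal(size=n)                              # stage-1 covariate
A1 = rng.binomial(1, 1 / (1 + np.exp(-0.5 * X1)))    # treatment 1, confounded by X1
X2 = 0.5 * X1 + 0.3 * A1 + rng.normal(size=n)        # stage-2 covariate
A2 = rng.binomial(1, 1 / (1 + np.exp(-0.4 * X2)))    # treatment 2, confounded by X2
Y = X1 + A1 * (1 + X1) + A2 * (0.5 - X2) + rng.normal(size=n)  # terminal outcome

def fit_propensity(x, a, iters=25):
    """Logistic regression of a on (1, x) by Newton-Raphson; returns fitted P(a=1|x)."""
    Z = np.column_stack([np.ones_like(x), x])
    b = np.zeros(2)
    for _ in range(iters):
        p = 1 / (1 + np.exp(-Z @ b))
        b += np.linalg.solve(Z.T @ (Z * (p * (1 - p))[:, None]), Z.T @ (a - p))
    return 1 / (1 + np.exp(-Z @ b))

def wls(Z, y, w):
    """Weighted least squares: solve (Z'WZ) b = Z'Wy."""
    Zw = Z * w[:, None]
    return np.linalg.solve(Zw.T @ Z, Zw.T @ y)

# ---- Stage 2: weighted Q-regression of Y on history, A2, and A2*X2 ----
ps2 = fit_propensity(X2, A2)
w2 = A2 / ps2 + (1 - A2) / (1 - ps2)                 # inverse-probability weights
Z2 = np.column_stack([np.ones(n), X1, A1, A1 * X1, X2, A2, A2 * X2])
b2 = wls(Z2, Y, w2)
A2_opt = (b2[5] + b2[6] * X2 > 0).astype(float)      # estimated optimal stage-2 rule
# Pseudo-outcome: predicted Y had the optimal stage-2 action been taken
Z2_opt = np.column_stack([np.ones(n), X1, A1, A1 * X1, X2, A2_opt, A2_opt * X2])
Y_tilde = Z2_opt @ b2

# ---- Stage 1: weighted Q-regression of the pseudo-outcome on stage-1 history ----
ps1 = fit_propensity(X1, A1)
w1 = A1 / ps1 + (1 - A1) / (1 - ps1)
Z1 = np.column_stack([np.ones(n), X1, A1, A1 * X1])
b1 = wls(Z1, Y_tilde, w1)

def d1_hat(x1):
    """Estimated stage-1 decision rule: treat iff the estimated blip is positive."""
    return (b1[2] + b1[3] * x1 > 0).astype(float)
```

The recursion runs backward: the stage-2 Q-function is fitted first, its maximum over treatments defines the pseudo-outcome, and the stage-1 Q-function is fitted to that pseudo-outcome; the weights reweight the observational sample so each treatment arm is represented as in a randomized trial.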
