Inference about the expected performance of a data-driven dynamic treatment regime

Clin Trials. 2014 Aug;11(4):408-417. doi: 10.1177/1740774514537727. Epub 2014 Jun 12.

Abstract

Background: A dynamic treatment regime (DTR) comprises a sequence of decision rules, one per stage of intervention, that recommends how to individualize treatment to patients based on evolving treatment and covariate history. These regimes are useful for managing chronic disorders, and fit into the larger paradigm of personalized medicine. The Value of a DTR is the expected outcome when the DTR is used to assign treatments to a population of interest.

Purpose: The Value of a data-driven DTR, estimated using data from a Sequential Multiple Assignment Randomized Trial, is both a data-dependent parameter and a non-smooth function of the underlying generative distribution. These features introduce additional variability that is not accounted for by standard methods for conducting statistical inference, for example, the bootstrap or normal approximations, if applied without adjustment. Our purpose is to provide a feasible method for constructing valid confidence intervals (CIs) for this quantity of practical interest.

Methods: We propose a conceptually simple and computationally feasible method for constructing valid CIs for the Value of an estimated DTR based on subsampling. The method is self-tuning by virtue of an approach called the double bootstrap. We demonstrate the proposed method using a series of simulated experiments.

Results: The proposed method offers considerable improvement in terms of coverage rates of the CIs over the standard bootstrap approach.

Limitations: In this article, we have restricted our attention to Q-learning for estimating the optimal DTR. However, other methods can be employed for this purpose; to keep the discussion focused, we have not explored these alternatives.

Conclusion: Subsampling-based CIs provide much better performance compared to standard bootstrap for the Value of an estimated DTR.