Background: Although the superior internal validity of the randomized clinical trial (RCT) is invaluable for establishing causality, generalizability is far from guaranteed. In particular, strict selection criteria intended to maximize treatment efficacy and safety can impair external validity. This problem is widely acknowledged in principle but sometimes ignored in practice, with considerable consequences for treatment options.
Purpose: We demonstrate how selection of patients for an RCT can bias the results when the treatment effect varies across individuals. Indeed, not only the magnitude, but even the direction of the causal effect found in an RCT can differ from the causal effect in the target population.
Methods: A counterfactual model is developed to represent the selection process explicitly. This simple extension of the standard counterfactual model is used to explore the implications of restrictive exclusion criteria intended to eliminate high-risk individuals. The counterintuitive findings of a recent FDA meta-analysis of suicidality in pediatric populations treated with antidepressant medications are interpreted in the light of this counterfactual model.
Results: When the causal effect of an intervention can vary across individuals, the potential for selection bias (in the sense of a threat to external validity) can be serious. In particular, we demonstrate that the stricter the inclusion/exclusion criteria, the greater the potential inflation of relative risk. A critical factor in determining bias is the extent to which individuals with differing types of causal effects can be distinguished prior to sampling. Furthermore, we propose methods that can sometimes help to identify bias in an actual study. When applied to the FDA meta-analysis of pediatric suicidality in RCTs of modern antidepressant medications, these methods suggest that the elevated risk observed may be an artifact of selection bias.
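The inflation of relative risk under stricter exclusion can be illustrated with a minimal numerical sketch. It is not the model developed in the paper. It assumes four hypothetical causal types defined by their potential outcomes, invented population shares, and a screening step (`trial_sample`, with a hypothetical `exclusion_rate` parameter) that removes a fraction of individuals who would experience the event without treatment, that is, the high-risk individuals.

```python
# Illustrative sketch only: the causal types, their shares, and the
# screening rule are assumptions, not values or methods from the paper.

# Share of the target population in each causal type. A type is defined
# by its potential outcomes (event without treatment, event with treatment).
POPULATION = {
    "doomed": 0.02,  # event with or without treatment
    "harmed": 0.02,  # event only with treatment
    "helped": 0.06,  # event only without treatment
    "immune": 0.90,  # no event either way
}


def relative_risk(shares):
    """Risk under treatment divided by risk under control."""
    risk_treated = shares["doomed"] + shares["harmed"]
    risk_control = shares["doomed"] + shares["helped"]
    return risk_treated / risk_control


def trial_sample(shares, exclusion_rate):
    """Composition of the trial sample after screening.

    Assumption: screening removes a fraction `exclusion_rate` of the
    individuals who would have the event without treatment (the
    "doomed" and "helped" types). The other types are unaffected.
    """
    kept = dict(shares)
    for causal_type in ("doomed", "helped"):
        kept[causal_type] = shares[causal_type] * (1.0 - exclusion_rate)
    total = sum(kept.values())
    # Renormalize so the shares of the trial sample sum to one.
    return {name: share / total for name, share in kept.items()}


for rate in (0.0, 0.5, 0.8, 0.9):
    rr = relative_risk(trial_sample(POPULATION, rate))
    print(f"exclusion rate {rate:.1f}: relative risk {rr:.2f}")
```

Under these assumed shares, the treatment is protective in the target population (relative risk 0.50), yet the relative risk in the trial sample rises to 0.75, 1.50, and 2.75 as the exclusion rate increases to 0.5, 0.8, and 0.9. Both the magnitude and the direction of the estimated effect change, even though no individual's causal effect has changed.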
Limitations: Real-life scenarios are generally more complex than the counterfactual model presented here. Future modeling efforts are needed to refine and extend our approach.
Conclusions: When variation of treatment effects across individuals is plausible, lack of generalizability should be a serious concern. Therefore, the external validity of an RCT needs to be carefully considered in both its design and the interpretation of its results, especially when the study can influence regulatory decisions about drug safety. RCTs should not automatically be considered definitive, especially when their results conflict with those of observational studies. Whenever possible, empirical evidence of bias resulting from sample selection should be obtained and taken into account.