This paper addresses strategies for selecting variables for adjustment in non-experimental comparative effectiveness research and uses causal graphs to illustrate the causal network that relates treatment to outcome. Variables in the causal network take on multiple structural forms. Adjustment for a common cause pathway between treatment and outcome can remove confounding, whereas adjustment for variables of other structural types may increase bias. For this reason, variable selection would ideally be based on an understanding of the causal network; however, the true causal network is rarely known. Therefore, we describe more practical variable selection approaches based on background knowledge when the causal structure is only partially known. These approaches include adjustment for all observed pretreatment variables thought to have some connection to the outcome, all known risk factors for the outcome, and all direct causes of the treatment or the outcome. Empirical approaches, such as forward and backward selection and automatic high-dimensional proxy adjustment, are also discussed. As there is a continuum between knowing and not knowing the causal, structural relations of variables, we recommend addressing variable selection in a practical way that combines background knowledge with empirical selection and that uses high-dimensional approaches. The empirical approach can be used either to select, from a set of a priori variables chosen on the basis of the researcher's knowledge, those to be included in the final analysis, or to identify additional variables for consideration. This more limited use of empirically derived variables may reduce confounding while simultaneously reducing the risk of including variables that may increase bias.
Keywords: comparative effectiveness research; covariate selection; pharmacoepidemiology; propensity scores.
Copyright © 2013 John Wiley & Sons, Ltd.