Most epidemiology textbooks that discuss models are vague on details of model selection. This lack of detail may be understandable since selection should be strongly influenced by features of the particular study, including contextual (prior) information about covariates that may confound, modify, or mediate the effect under study. It is thus important that authors document their modeling goals and strategies and understand the contextual interpretation of model parameters and model selection criteria. To illustrate this point, we review several established strategies for selecting model covariates, describe their shortcomings, and point to refinements, assuming that the main goal is to derive the most accurate effect estimates obtainable from the data and available resources. This goal shifts the focus to prediction of exposure or potential outcomes (or both) to adjust for confounding; it thus differs from the goal of ordinary statistical modeling, which is to passively predict outcomes. Nonetheless, methods and software for passive prediction can be used for causal inference as well, provided that the target parameters are shifted appropriately.
Keywords: causal inference; confounding; modeling; variable selection.