Methods of covariate selection: directed acyclic graphs and the change-in-estimate procedure

Am J Epidemiol. 2009 May 15;169(10):1182-90. doi: 10.1093/aje/kwp035. Epub 2009 Apr 10.


Four covariate selection approaches were compared: a directed acyclic graph (DAG) full model and 3 combined DAG and change-in-estimate procedures. Twenty-five scenarios with case-control samples were generated from 10 simulated populations to assess the performance of these covariate selection procedures in the presence of confounders of various strengths and under DAG misspecification, with either omission of confounders or inclusion of nonconfounders. Performance was evaluated by standard error, bias, square root of the mean-squared error, and 95% confidence interval coverage. In most scenarios, the DAG full model without further covariate selection performed as well as or better than the other procedures when the DAGs were correctly specified, as well as when confounders were omitted. Model reduction by change-in-estimate procedures showed potential gains in precision when the DAGs included nonconfounders, but underestimation of the regression-based standard error might reduce 95% confidence interval coverage. For modeling binary outcomes in a case-control study, the authors recommend constructing a "conservative" DAG, determining all potential confounders, and then applying change-in-estimate procedures to simplify this full model. The authors advocate that, under the conditions investigated, the selection of the final model should be based on changes in precision: Adopt the reduced model if its standard error (derived from logistic regression) is substantially smaller; otherwise, the full DAG-based model is appropriate.
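The recommended workflow (start from the full DAG-based covariate set, simplify by change-in-estimate, then keep the reduced model only if precision improves substantially) can be sketched in Python as follows. This is a minimal illustration, not a reproduction of any of the 3 combined procedures evaluated in the paper, whose details are not given in the abstract. Several things here are assumptions: the `fit` callable (a hypothetical wrapper around a logistic regression that returns the exposure log odds ratio and its standard error for a given covariate set), the 10% change-in-odds-ratio threshold, the `min_se_gain` cutoff used to quantify "substantially smaller", and the `toy_fit` stand-in with made-up numbers.

```python
import math
from typing import Callable, FrozenSet, Iterable, Tuple

# Hypothetical interface: fit(covariates) -> (exposure log odds ratio, standard error).
# In practice this would wrap a logistic regression of the outcome on the
# exposure plus the given covariates.
Fit = Callable[[FrozenSet[str]], Tuple[float, float]]


def change_in_estimate(fit: Fit, dag_covariates: Iterable[str],
                       threshold: float = 0.10) -> FrozenSet[str]:
    """Backward deletion starting from the full DAG-based covariate set.

    At each step, drop the covariate whose removal moves the exposure odds
    ratio least, as long as the relative change from the FULL-model odds
    ratio stays below `threshold` (assumed 10%; not taken from the abstract).
    """
    kept = frozenset(dag_covariates)
    full_est, _ = fit(kept)
    while kept:
        # Relative change in the odds ratio versus the full model.
        changes = {
            c: abs(math.exp(fit(kept - {c})[0] - full_est) - 1.0)
            for c in kept
        }
        # sorted() makes tie-breaking deterministic.
        candidate = min(sorted(changes), key=changes.get)
        if changes[candidate] >= threshold:
            break
        kept = kept - {candidate}
    return kept


def choose_final_model(fit: Fit, dag_covariates: Iterable[str],
                       threshold: float = 0.10,
                       min_se_gain: float = 0.10) -> FrozenSet[str]:
    """Adopt the reduced model only if its standard error is substantially
    smaller than the full model's; otherwise keep the full DAG-based model.

    `min_se_gain` (assumed: at least a 10% relative reduction) is one possible
    reading of "substantially smaller"; the abstract gives no numeric cutoff.
    """
    full = frozenset(dag_covariates)
    reduced = change_in_estimate(fit, full, threshold)
    se_full = fit(full)[1]
    se_reduced = fit(reduced)[1]
    if se_reduced <= (1.0 - min_se_gain) * se_full:
        return reduced
    return full


def toy_fit(covariates: FrozenSet[str]) -> Tuple[float, float]:
    """Illustrative stand-in for a fitted model (made-up numbers, not results).

    Omitting "smoking" or "age" shifts the estimate noticeably (confounders);
    omitting "noise" barely shifts it (nonconfounder). Each covariate adds
    a little to the standard error.
    """
    est = 0.5
    est += 0.30 if "smoking" not in covariates else 0.0
    est += 0.20 if "age" not in covariates else 0.0
    est += 0.01 if "noise" not in covariates else 0.0
    se = 0.10 + 0.03 * len(covariates)
    return est, se


final_covariates = choose_final_model(toy_fit, ["age", "smoking", "noise"])
```

In this sketch the change is measured against the full-model estimate rather than the previous step's estimate, so that a series of small drifts cannot accumulate unnoticed; comparing against the previous step is an equally common variant.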

MeSH terms

  • Algorithms
  • Computer Graphics
  • Computer Simulation*
  • Confidence Intervals
  • Epidemiologic Methods*
  • Humans
  • Logistic Models
  • Models, Statistical
  • Multivariate Analysis*
  • Regression Analysis
  • Statistics as Topic