Outcome modelling strategies in epidemiology: traditional methods and basic alternatives

Int J Epidemiol. 2016 Apr;45(2):565-75. doi: 10.1093/ije/dyw040. Epub 2016 Apr 20.

Abstract

Controlling for too many potential confounders can lead to or aggravate problems of data sparsity or multicollinearity, particularly when the number of covariates is large in relation to the study size. As a result, methods to reduce the number of modelled covariates are often deployed. We review several traditional modelling strategies, including stepwise regression and the 'change-in-estimate' (CIE) approach to deciding which potential confounders to include in an outcome-regression model for estimating effects of a targeted exposure. We discuss their shortcomings, and then provide some basic alternatives and refinements that do not require special macros or programming. Throughout, we assume the main goal is to derive the most accurate effect estimates obtainable from the data and commercial software. Allowing that most users must stay within standard software packages, this goal can be roughly approximated using basic methods to assess, and thereby minimize, mean squared error (MSE).

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Interpretation, Statistical*
  • Epidemiologic Methods*
  • Humans
  • Models, Statistical*
  • Software