Controlling for continuous confounders in epidemiologic research

Epidemiology. 1997 Jul;8(4):429-34.


Multiple regression models are commonly used to control for confounding in epidemiologic research. Parametric regression models, such as multiple logistic regression, are powerful tools for controlling for multiple covariates, provided that the covariate-risk associations are correctly specified. Residual confounding may result, however, from inappropriate specification of the confounder-risk association. In this paper, we illustrate the order of magnitude of residual confounding that may occur with traditional approaches to control for continuous confounders in multiple logistic regression, such as inclusion of a single linear term or categorization of the confounder, under a variety of assumptions about the confounder-risk association. We show that inclusion of the confounder as a single linear term often provides satisfactory control for confounding even in situations in which the model assumptions are clearly violated. In contrast, categorization of the confounder may often lead to serious residual confounding if the number of categories is small. Alternative strategies to control for confounding, such as polynomial regression or linear spline regression, are a useful supplement to the more traditional approaches.
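
The abstract names four ways of entering a continuous confounder into a logistic model but does not give the specifications used in the paper. As a minimal sketch, the Python functions below show how each approach might turn one confounder value into model columns. The function names, the cutpoints, and the knots are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from bisect import bisect_right


def linear_term(x):
    """Single linear term: one column holding the raw confounder value."""
    return [x]


def category_indicators(x, cutpoints):
    """Categorization: one 0/1 indicator per category above the reference.

    `cutpoints` must be sorted ascending. Values below the first cutpoint
    fall in the reference category (all indicators 0).
    """
    k = bisect_right(cutpoints, x)  # index of the category containing x
    return [1 if k == j else 0 for j in range(1, len(cutpoints) + 1)]


def polynomial_terms(x, degree):
    """Polynomial regression: x, x^2, ..., x^degree."""
    return [x ** p for p in range(1, degree + 1)]


def linear_spline_terms(x, knots):
    """Linear spline: x plus one hinge term max(0, x - knot) per knot,
    so the slope may change at each knot while the curve stays continuous."""
    return [x] + [max(0, x - knot) for knot in knots]


# Illustrative values only (assumed, not from the paper): a confounder
# value of 57 with cutpoints/knots at 50, 60, and 70.
value = 57
cuts = [50, 60, 70]
print(linear_term(value))
print(category_indicators(value, cuts))
print(polynomial_terms(value, 2))
print(linear_spline_terms(value, cuts))
```

With few cutpoints, every value inside a category receives the same indicator pattern, so any variation in risk within that category is left unadjusted. The linear, polynomial, and spline columns, by contrast, keep the value itself in the model.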

MeSH terms

  • Bias
  • Cohort Studies*
  • Computer Simulation*
  • Confounding Factors, Epidemiologic*
  • Humans
  • Logistic Models
  • Models, Statistical*
  • Odds Ratio
  • Regression Analysis*
  • Risk Assessment