Model selection and health effect estimation in environmental epidemiology

Epidemiology. 2008 Jul;19(4):558-60. doi: 10.1097/EDE.0b013e31817307dc.

Abstract

In air pollution epidemiology, improvements in statistical analysis tools can help improve signal-to-noise ratios and untangle large correlations between exposures and confounders. For this reason, we welcome a novel model-selection approach that helps to identify the time windows of exposure to pollutants that produce adverse health effects. However, there are concerns about approaches that select a model based on a given data set and then estimate health effects in the same data. This can create problems when (1) the sample size is small in relation to the magnitude of the health effects; and (2) candidate predictors are highly correlated and likely to have similar effects. Bayesian Model Averaging has been advocated as a way to estimate health effects that accounts for model uncertainty. However, implementations in which posterior model probabilities are approximated using the Bayesian Information Criterion (BIC), as well as other default choices, may not reflect the ability of each model to provide an estimate of the health effect that is properly adjusted for confounding. Air pollution studies need to focus on estimating health effects while accounting for the uncertainty in the adjustment for confounding factors. This is especially true when model choice and estimation are performed on the same data. The development of appropriate statistical tools remains an open area of investigation.
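
As an illustration of the BIC approximation mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes equal prior probabilities for all candidate models, under which the approximate posterior probability of each model is proportional to exp(-BIC/2). The function names and all numeric values are hypothetical and chosen only to show the arithmetic.

```python
import math


def bic_model_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior model probabilities (an assumption of this sketch),
    so each weight is proportional to exp(-BIC / 2).
    """
    best = min(bics)
    # Subtract the smallest BIC before exponentiating for numerical stability;
    # the shift cancels when the weights are normalized.
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def model_averaged_estimate(estimates, weights):
    """Weighted average of the model-specific effect estimates."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical BIC values and health effect estimates for three candidate
# models (illustrative numbers only, not taken from any study).
bics = [1000.0, 1002.0, 1010.0]
estimates = [0.50, 0.30, 0.10]

weights = bic_model_weights(bics)
averaged = model_averaged_estimate(estimates, weights)

print("weights:", [round(w, 3) for w in weights])
print("model-averaged estimate:", round(averaged, 3))
```

In this sketch the weights depend only on how well each model fits the outcome, as summarized by BIC. A model that omits a confounder can still receive a large weight if it fits well, which is one way to read the abstract's concern that such weights may not reflect how well each model adjusts the health effect estimate for confounding.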

Publication types

  • Comment
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Air Pollutants / adverse effects*
  • Bayes Theorem
  • Data Interpretation, Statistical
  • Epidemiology / standards*
  • Humans
  • Models, Statistical*
  • Research / standards*

Substances

  • Air Pollutants