The effect of unmeasured confounders on the ability to estimate a true performance or selection gradient (and other partial regression coefficients)

Jeffrey A Walker

doi:10.1111/evo.12406

Evolution. 2014 Jul;68(7):2128-36. doi: 10.1111/evo.12406. Epub 2014 Apr 16.

Author

Jeffrey A Walker¹

Affiliation

¹ Department of Biological Sciences, University of Southern Maine, Portland, Maine, 04103. walker@maine.edu.

PMID: 24635123
DOI: 10.1111/evo.12406

Abstract

Multiple regression of observational data is frequently used to infer causal effects. Partial regression coefficients are biased estimates of causal effects if unmeasured confounders are not in the regression model. The sensitivity of partial regression coefficients to omitted confounders is investigated with a Monte-Carlo simulation. A subset of causal traits is "measured" and their effects are estimated using ordinary least squares regression and compared to their expected values. Three major results are: (1) the error due to confounding is much larger than that due to sampling, especially with large samples, (2) confounding error shrinks trivially with sample size, and (3) small true effects are frequently estimated as large effects. Consequently, confidence intervals from regression are poor guides to the true intervals, especially with large sample sizes. The addition of a confounder to the model improves estimates only 55% of the time. Results are improved with complete knowledge of the rank order of causal effects but even with this omniscience, measured intervals are poor proxies for true intervals if there are many unmeasured confounders. The results suggest that only under very limited conditions can we have much confidence in the magnitude of partial regression coefficients as estimates of causal effects.
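The core mechanism in the abstract, an omitted confounder biasing a partial regression coefficient by an amount that does not shrink with sample size, can be illustrated with a minimal sketch. This is not the author's simulation. It assumes only two standardized traits, one "measured" and one "unmeasured", and the correlation `rho`, the effect sizes `b_measured` and `b_unmeasured`, and the function names are all hypothetical values chosen for illustration.

```python
import numpy as np


def ols_slopes(X, y):
    """Return OLS slope estimates (an intercept is fitted, then dropped)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


def simulate(n, rho=0.6, b_measured=0.1, b_unmeasured=0.5, seed=0):
    """Estimate the measured trait's effect with and without the confounder.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Two unit-variance causal traits with correlation rho.
    cov = np.array([[1.0, rho], [rho, 1.0]])
    traits = rng.multivariate_normal(np.zeros(2), cov, size=n)
    # The response depends on both traits plus independent noise.
    y = traits @ np.array([b_measured, b_unmeasured]) + rng.normal(size=n)
    # Model 1: the confounder is omitted (only the first trait is "measured").
    omitted = ols_slopes(traits[:, :1], y)[0]
    # Model 2: both causal traits are in the regression.
    full = ols_slopes(traits, y)[0]
    return omitted, full


if __name__ == "__main__":
    for n in (100, 100_000):
        omitted, full = simulate(n)
        print(f"n={n}: confounder omitted={omitted:.3f}, confounder included={full:.3f}")
```

Under these assumptions the estimate from the model that omits the confounder converges to `b_measured + rho * b_unmeasured` (0.4 here) rather than to the true effect of 0.1. A larger sample narrows the confidence interval around the biased value but does not remove the bias, which is why a small true effect can be estimated as a large one.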

Keywords: Effect size; Monte-Carlo; multiple regression; omitted variable bias; sensitivity analysis; simulation.

MeSH terms

Confounding Factors, Epidemiologic
Models, Genetic*
Monte Carlo Method
Selection, Genetic*