Comparing high-dimensional confounder control methods for rapid cohort studies from electronic health records

J Comp Eff Res. 2016 Mar;5(2):179-92. doi: 10.2217/cer.15.53. Epub 2015 Dec 4.


Aims: Electronic health records (EHRs), which contain rich clinical histories of large patient populations, can inform clinical decisions when evidence from trials and the literature is absent. Enabling such observational studies from EHRs in real time, particularly in emergencies, requires rapid confounder control methods that can handle numerous variables and adjust for biases. This study compares the performance of 18 automatic confounder control methods.

Methods: The methods compared include propensity scores, direct adjustment by machine learning, similarity matching and resampling. They were evaluated in two simulated datasets and one real-world EHR dataset.
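
To make the propensity score family concrete, the following is a minimal sketch rather than the study's implementation. It assumes inverse probability weighting with a logistic propensity model fitted by plain gradient ascent, and it runs on a small hypothetical simulated cohort that is unrelated to the datasets used in the study. All function names, coefficients and sample sizes are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def simulate_cohort(n, n_covariates, true_effect, seed):
    # Hypothetical cohort: covariate 0 drives both treatment and outcome,
    # so it confounds the crude treated-vs-untreated comparison.
    rng = random.Random(seed)
    rows, treated, outcome = [], [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_covariates)]
        t = 1 if rng.random() < sigmoid(0.8 * x[0]) else 0
        y = true_effect * t + 1.5 * x[0] + rng.gauss(0.0, 1.0)
        rows.append(x)
        treated.append(t)
        outcome.append(y)
    return rows, treated, outcome


def fit_propensity(rows, treated, n_iter=200, step=1.0):
    # Logistic regression of treatment on all covariates, fitted by
    # batch gradient ascent on the mean log-likelihood.
    n, p = len(rows), len(rows[0])
    intercept, weights = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_intercept, grad_weights = 0.0, [0.0] * p
        for x, t in zip(rows, treated):
            linear = intercept + sum(w * xj for w, xj in zip(weights, x))
            err = t - sigmoid(linear)
            grad_intercept += err
            for j, xj in enumerate(x):
                grad_weights[j] += err * xj
        intercept += step * grad_intercept / n
        weights = [w + step * g / n for w, g in zip(weights, grad_weights)]
    # Return the estimated probability of treatment for every patient.
    return [sigmoid(intercept + sum(w * xj for w, xj in zip(weights, x)))
            for x in rows]


def ipw_effect(scores, treated, outcome):
    # Normalised (Hajek) inverse-probability-weighted difference in means.
    num1 = den1 = num0 = den0 = 0.0
    for e, t, y in zip(scores, treated, outcome):
        if t:
            num1 += y / e
            den1 += 1.0 / e
        else:
            num0 += y / (1.0 - e)
            den0 += 1.0 / (1.0 - e)
    return num1 / den1 - num0 / den0


def crude_effect(treated, outcome):
    # Unadjusted difference in mean outcome between the two groups.
    y1 = [y for t, y in zip(treated, outcome) if t]
    y0 = [y for t, y in zip(treated, outcome) if not t]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


if __name__ == "__main__":
    rows, treated, outcome = simulate_cohort(2000, 3, true_effect=1.0, seed=1)
    scores = fit_propensity(rows, treated)
    print("crude estimate:", round(crude_effect(treated, outcome), 3))
    print("IPW estimate:  ", round(ipw_effect(scores, treated, outcome), 3))
```

In this toy setting the crude estimate absorbs the effect of the confounder, while the weighted estimate should land much closer to the simulated effect of 1.0. An expert-based propensity score would differ mainly in how the covariates entering the model are chosen.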

Results & conclusions: Direct adjustment by lasso regression and ensemble models involving multiple resamples perform comparably to expert-based propensity scores and thus may help provide real-time EHR-based evidence for timely clinical decisions.
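
The idea behind direct adjustment by lasso regression can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes a continuous outcome, a linear model, a fixed penalty of 0.05 rather than a tuned one, and a treatment coefficient left unpenalized so that only the candidate confounders are shrunk. The simulated cohort, the function names and all numeric settings are hypothetical.

```python
import math
import random


def simulate(n, n_covariates, true_effect, seed):
    # Hypothetical cohort stored column-wise; covariates 0 and 1 confound
    # treatment and outcome, the remaining covariates are pure noise.
    rng = random.Random(seed)
    covariates = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                  for _ in range(n_covariates)]
    treatment, outcome = [], []
    for i in range(n):
        confounding = covariates[0][i] + covariates[1][i]
        p_treat = 1.0 / (1.0 + math.exp(-confounding))
        t = 1.0 if rng.random() < p_treat else 0.0
        treatment.append(t)
        outcome.append(true_effect * t + 2.0 * confounding + rng.gauss(0.0, 1.0))
    return covariates, treatment, outcome


def center(values):
    # Subtract the mean so that no separate intercept term is needed.
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(columns, target, penalty, penalized,
                             max_sweeps=200, tol=1e-8):
    # Minimises (1/2n) * ||target - sum_j b_j * col_j||^2
    #           + penalty * sum over penalized j of |b_j|.
    n = len(target)
    coefs = [0.0] * len(columns)
    residual = list(target)  # all coefficients start at zero
    scale = [sum(v * v for v in col) / n for col in columns]
    for _ in range(max_sweeps):
        largest_change = 0.0
        for j, col in enumerate(columns):
            # Correlation of column j with the partial residual.
            rho = sum(c * r for c, r in zip(col, residual)) / n + scale[j] * coefs[j]
            shrunk = soft_threshold(rho, penalty) if penalized[j] else rho
            updated = shrunk / scale[j]
            change = updated - coefs[j]
            if change != 0.0:
                residual = [r - change * c for r, c in zip(residual, col)]
                coefs[j] = updated
                largest_change = max(largest_change, abs(change))
        if largest_change < tol:
            break
    return coefs


def lasso_adjusted_effect(covariates, treatment, outcome, penalty=0.05):
    # Column 0 is the treatment (unpenalized); the rest are covariates.
    columns = [center(treatment)] + [center(col) for col in covariates]
    penalized = [False] + [True] * len(covariates)
    coefs = lasso_coordinate_descent(columns, center(outcome), penalty, penalized)
    return coefs[0]


def unadjusted_effect(treatment, outcome):
    # Crude difference in mean outcome, ignoring all covariates.
    treated = [y for t, y in zip(treatment, outcome) if t == 1.0]
    untreated = [y for t, y in zip(treatment, outcome) if t == 0.0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


if __name__ == "__main__":
    covariates, treatment, outcome = simulate(1000, 10, true_effect=1.0, seed=0)
    print("unadjusted:    ", round(unadjusted_effect(treatment, outcome), 3))
    print("lasso-adjusted:", round(lasso_adjusted_effect(covariates, treatment, outcome), 3))
```

The adjusted estimate is simply the fitted treatment coefficient. Because the penalty shrinks the confounder coefficients slightly, a small residual bias is expected; in practice the penalty would usually be chosen by cross-validation rather than fixed.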

Keywords: bias; clinical decision support; cohort studies; confounding; electronic health records; machine learning; propensity scores.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Cohort Studies
  • Confounding Factors, Epidemiologic*
  • Electronic Health Records / statistics & numerical data*
  • Humans
  • Machine Learning / statistics & numerical data
  • Propensity Score