Literature-Based Discovery of Confounding in Observational Clinical Data

AMIA Annu Symp Proc. 2017 Feb 10;2016:1920-1929. eCollection 2016.

Abstract

Observational data recorded in the Electronic Health Record (EHR) can help us better understand the effects of therapeutic agents in routine clinical practice. Because such data were not collected for research purposes, their reuse for research must account for additional factors that may bias analyses and lead to faulty conclusions. Confounding is present when factors other than the given predictor(s) affect the response of interest. However, these additional factors may not be known at the outset. In this paper, we present a scalable literature-based method for discovering confounding variables in biomedical research applications, with pharmacovigilance as our use case. We hypothesized that statistical models adjusted with literature-derived confounders would more accurately identify causative drug-adverse drug event (ADE) relationships. We evaluated our method against a curated reference standard and found a pattern of improved performance of approximately 5% in two out of three models for gastrointestinal bleeding (pre-adjusted Area Under Curve ≥ 0.6).
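
To illustrate why adjusting for a confounder can change the apparent strength of a drug-ADE association, the following is a minimal sketch using a stratified (Mantel-Haenszel) odds ratio. It is not the method used in the paper, which adjusts statistical models with literature-derived confounders; the counts, the two confounder strata, and all function names are hypothetical and chosen only to make the effect visible.

```python
from typing import List, Tuple

# One 2x2 table per confounder stratum, in the order:
# (exposed with event, exposed without event,
#  unexposed with event, unexposed without event)
Table = Tuple[int, int, int, int]


def odds_ratio(table: Table) -> float:
    """Odds ratio of a single 2x2 table."""
    a, b, c, d = table
    return (a * d) / (b * c)


def collapse(strata: List[Table]) -> Table:
    """Sum the strata cell by cell, i.e. ignore the confounder."""
    a, b, c, d = (sum(cells) for cells in zip(*strata))
    return (a, b, c, d)


def mantel_haenszel_or(strata: List[Table]) -> float:
    """Mantel-Haenszel pooled odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts (NOT from the paper): within each stratum the
# drug and the event are unrelated, but the confounder is associated
# with both, which inflates the unadjusted estimate.
strata: List[Table] = [
    (40, 60, 8, 12),   # confounder present
    (2, 18, 10, 90),   # confounder absent
]

crude_or = odds_ratio(collapse(strata))
adjusted_or = mantel_haenszel_or(strata)

print(f"crude OR:    {crude_or:.2f}")
print(f"adjusted OR: {adjusted_or:.2f}")
```

In this toy data the unadjusted odds ratio suggests a strong association, while the stratum-adjusted estimate shows none. This is the kind of spurious signal that confounder adjustment is meant to remove.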

MeSH terms

  • Area Under Curve
  • Biomedical Research
  • Confounding Factors, Epidemiologic*
  • Drug-Related Side Effects and Adverse Reactions
  • Electronic Health Records*
  • Humans
  • Models, Theoretical
  • Pharmacovigilance*