A Robust e-Epidemiology Tool in Phenotyping Heart Failure with Differentiation for Preserved and Reduced Ejection Fraction: the Electronic Medical Records and Genomics (eMERGE) Network

J Cardiovasc Transl Res. 2015 Nov;8(8):475-83. doi: 10.1007/s12265-015-9644-2. Epub 2015 Jul 21.


Identifying populations of heart failure (HF) patients is paramount to research efforts aimed at developing strategies to effectively reduce the burden of this disease. Using electronic medical record (EMR) data for this purpose is challenging, given the syndromic nature of HF and the need to distinguish HF with preserved ejection fraction from HF with reduced ejection fraction. Using a gold standard cohort of manually abstracted cases, an EMR-driven phenotype algorithm based on structured and unstructured data was developed to identify all the cases. The resulting algorithm was executed in two cohorts from the Electronic Medical Records and Genomics (eMERGE) Network, with a positive predictive value of >95%. The algorithm was then expanded to include three hierarchical definitions of HF (i.e., definite, probable, possible), based on the degree of confidence of the classification, to capture HF cases in a whole population, thereby increasing the algorithm's utility for e-Epidemiologic research.
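For readers unfamiliar with the validation metric, positive predictive value (PPV) is the fraction of algorithm-flagged cases that the gold standard confirms: PPV = TP / (TP + FP). The following is a minimal sketch of that calculation only. The function name and the toy data are hypothetical and are not taken from the study; the abstract does not describe the algorithm's rules or report case counts.

```python
def positive_predictive_value(algorithm_flags, gold_standard):
    """Return PPV = TP / (TP + FP) for paired boolean labels.

    algorithm_flags: True where the phenotype algorithm flags a patient as HF.
    gold_standard:   True where manual chart abstraction confirms HF.
    """
    if len(algorithm_flags) != len(gold_standard):
        raise ValueError("label lists must have the same length")

    pairs = list(zip(algorithm_flags, gold_standard))
    true_positives = sum(1 for flagged, confirmed in pairs if flagged and confirmed)
    false_positives = sum(1 for flagged, confirmed in pairs if flagged and not confirmed)

    if true_positives + false_positives == 0:
        raise ValueError("algorithm flagged no cases; PPV is undefined")
    return true_positives / (true_positives + false_positives)


# Hypothetical toy data (NOT from the study): six patients.
algorithm_flags = [True, True, True, True, False, False]
gold_standard = [True, True, True, False, True, False]

ppv = positive_predictive_value(algorithm_flags, gold_standard)
print(f"PPV = {ppv:.2f}")  # 3 true positives, 1 false positive
```

Note that PPV says nothing about cases the algorithm misses (the fifth patient in the toy data); that is measured by sensitivity, which the abstract does not report.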

Keywords: Electronic medical records; Heart failure; Natural language processing; Phenotyping; Ventricular ejection fraction.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Data Mining / methods*
  • Electronic Health Records*
  • Female
  • Heart Failure / classification
  • Heart Failure / diagnosis*
  • Heart Failure / epidemiology
  • Heart Failure / physiopathology
  • Humans
  • Male
  • Natural Language Processing*
  • Phenotype
  • Reproducibility of Results
  • Stroke Volume*
  • United States / epidemiology
  • Ventricular Function, Left*