Mining Electronic Health Records to Extract Patient-Centered Outcomes Following Prostate Cancer Treatment

AMIA Annu Symp Proc. 2018 Apr 16;2017:876-882. eCollection 2017.


The granular clinical data in electronic health record (EHR) systems provide opportunities to improve patient care using informatics retrieval methods. However, accessing data within EHRs poses many well-known methodological obstacles. In particular, clinical notes routinely stored in EHRs are composed of narrative, highly unstructured, and heterogeneous biomedical text. This inherent complexity hinders automated, large-scale medical knowledge extraction unless computational linguistics methods are used. The aim of this work was to develop and validate a Natural Language Processing (NLP) pipeline to detect important patient-centered outcomes (PCOs), as interpreted and documented by clinicians in their dictated notes, for male patients receiving treatment for localized prostate cancer at an academic medical center.

MeSH terms

  • Aged
  • Algorithms
  • Data Mining / methods*
  • Electronic Health Records*
  • Erectile Dysfunction
  • Humans
  • Male
  • Medical Records Systems, Computerized
  • Middle Aged
  • Natural Language Processing*
  • Patient Outcome Assessment*
  • Postoperative Complications*
  • Prostatic Neoplasms / surgery*
  • Urinary Incontinence