Design of an extensive information representation scheme for clinical narratives

J Biomed Semantics. 2017 Sep 11;8(1):37. doi: 10.1186/s13326-017-0135-z.


Background: Knowledge representation frameworks are essential to the understanding of complex biomedical processes, and to the analysis of biomedical texts that describe them. Combined with natural language processing (NLP), they have the potential to contribute to retrospective studies by unlocking important phenotyping information contained in the narrative content of electronic health records (EHRs). This work aims to develop an extensive information representation scheme for clinical information contained in EHR narratives, and to support secondary use of EHR narrative data to answer clinical questions.

Methods: We review recent work that proposed information representation schemes and applied them to the analysis of clinical narratives. We then propose a unifying scheme that supports the extraction of information to address a large variety of clinical questions.

Results: We devised a new information representation scheme for clinical narratives that comprises 13 entities, 11 attributes and 37 relations. The associated annotation guidelines can be used to consistently apply the scheme to clinical narratives.

Conclusion: The information scheme includes many elements of the major schemes described in the clinical natural language processing literature, as well as a uniquely detailed set of relations.

Keywords: Clinical natural language processing; Knowledge representation.

MeSH terms

  • Biological Ontologies*
  • Data Mining / methods*
  • Electronic Health Records*
  • Humans
  • Natural Language Processing*