An evaluation of the UMLS in representing corpus derived clinical concepts

AMIA Annu Symp Proc. 2011;2011:435-44. Epub 2011 Oct 22.

Abstract

We evaluated how well the Unified Medical Language System (UMLS) represents concepts derived from medical narrative documents in three domains: chest x-ray reports, discharge summaries and admission notes. We detected concepts in these documents by identifying noun phrases (NPs) and N-grams, including unigrams (single words), bigrams (word pairs) and trigrams (word triples). After removing NPs and N-grams that did not represent discrete clinical concepts, we processed the remaining candidates with the UMLS MetaMap program. We manually reviewed the MetaMap output to determine whether MetaMap found full, partial or no representation of each concept. For full representations, we determined whether post-coordination was required. Our results showed that a large portion of the concepts found in clinical narrative documents is either unrepresented or poorly represented in the current version of the UMLS Metathesaurus, and that post-coordination was often required to fully represent a concept.
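
The N-gram step described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the text was tokenized or split into sentences, so the `tokenize`, `ngrams` and `candidate_ngrams` helpers, the regular expressions and the sample report text are all assumptions made for illustration. Noun-phrase detection, the manual filtering of non-clinical candidates and the MetaMap processing are not shown.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenization: lowercase alphanumeric words, keeping
    # hyphenated forms such as "x-ray" together as one token.
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def ngrams(tokens, n):
    # All contiguous runs of n tokens, joined with single spaces.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_ngrams(documents, max_n=3):
    # Count unigrams, bigrams and trigrams (for max_n=3) across documents.
    # N-grams are built within a sentence only, so that no candidate
    # spans a sentence boundary (an assumption, not stated in the source).
    counts = Counter()
    for doc in documents:
        for sentence in re.split(r"[.;:!?\n]+", doc):
            tokens = tokenize(sentence)
            for n in range(1, max_n + 1):
                counts.update(ngrams(tokens, n))
    return counts


if __name__ == "__main__":
    # Hypothetical report text, used only to exercise the functions.
    reports = ["No acute infiltrate. Heart size normal."]
    for phrase, count in sorted(candidate_ngrams(reports).items()):
        print(count, phrase)
```

In a pipeline like the one the abstract describes, the resulting candidate list would then be reviewed to drop strings that are not discrete clinical concepts before any mapping to the UMLS is attempted.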

Publication types

  • Evaluation Study

MeSH terms

  • Electronic Health Records*
  • Humans
  • Natural Language Processing*
  • Patient Admission
  • Patient Discharge
  • Radiography, Thoracic
  • Radiology Information Systems*
  • Unified Medical Language System*