Natural language processing to extract medical problems from electronic clinical documents: performance evaluation

J Biomed Inform. 2006 Dec;39(6):589-99. doi: 10.1016/j.jbi.2005.11.004. Epub 2005 Dec 5.


In this study, we evaluate the performance of a Natural Language Processing (NLP) application designed to extract medical problems from narrative-text clinical documents. The documents come from a patient's electronic medical record, and the extracted medical problems are proposed for inclusion in the patient's electronic problem list. This application has been developed to help maintain the problem list and make it more accurate, complete, and up-to-date. The NLP part of this system, which is the part analyzed in this study, uses the UMLS MetaMap Transfer (MMTx) application and a negation detection algorithm called NegEx to extract 80 different medical problems selected for their frequency of use in our institution. When using MMTx with its default data set, we measured a recall of 0.74 and a precision of 0.756. A custom data subset for MMTx was then created, which made the application faster and significantly improved the recall to 0.896, with a non-significant reduction in precision.
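The recall and precision figures above follow the standard definitions: recall is the fraction of problems actually present that the system found, and precision is the fraction of proposed problems that were correct. As a minimal sketch, the calculation might look like the following. The function name `precision_recall` and the counts in the example are hypothetical and chosen only for illustration; they are not data from this study.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from raw counts.

    true_positives:  problems proposed by the system and confirmed by the reference
    false_positives: problems proposed by the system but absent from the reference
    false_negatives: problems in the reference that the system missed
    """
    proposed = true_positives + false_positives
    present = true_positives + false_negatives
    # Guard against division by zero when nothing was proposed or present.
    precision = true_positives / proposed if proposed else 0.0
    recall = true_positives / present if present else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not taken from the study).
precision, recall = precision_recall(
    true_positives=90, false_positives=30, false_negatives=10
)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

With these illustrative counts, a system that misses few problems but proposes some incorrect ones shows a higher recall than precision, which is the general pattern reported for the custom data subset.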

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Hospital Information Systems
  • Humans
  • Information Storage and Retrieval
  • Medical Records Systems, Computerized*
  • Medical Records, Problem-Oriented
  • Natural Language Processing*
  • Reproducibility of Results
  • Software
  • User-Computer Interface