Extracting information on pneumonia in infants using natural language processing of radiology reports

J Biomed Inform. 2005 Aug;38(4):314-21. doi: 10.1016/j.jbi.2005.02.003. Epub 2005 Mar 30.


Natural language processing (NLP) is critical for improving the healthcare process because it can encode clinical data in patient documents. Many clinical applications, such as decision support, require coded data to function appropriately. However, to be applicable in healthcare, NLP performance must be adequate. A valuable automated application is the detection of infectious diseases, such as surveillance of pneumonia in newborns (i.e., neonates), because the disease produces significant rates of morbidity and mortality and manual surveillance is challenging. Studies have demonstrated that automated surveillance using NLP is a useful adjunct to manual surveillance and an effective tool for infection control practitioners. This paper presents a study evaluating the feasibility of an NLP-based monitoring system to screen for healthcare-associated pneumonia in neonates. We estimated sensitivity, specificity, and positive predictive value by comparing the system's results with clinicians' judgments. Sensitivity was 71% and specificity was 99%. Our results demonstrated that the automated method was feasible.
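
The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that each case has a boolean flag from the screening system and a boolean reference label from clinicians. The names `ScreeningMetrics` and `evaluate_screening` are hypothetical, and the sample data are invented for illustration; none of this is taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningMetrics:
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    ppv: float          # TP / (TP + FP), positive predictive value


def evaluate_screening(system_flags, clinician_labels):
    """Compare system flags with reference labels (hypothetical helper).

    Both arguments are equal-length sequences of booleans, where True
    means "pneumonia present" for that case.
    """
    if len(system_flags) != len(clinician_labels):
        raise ValueError("inputs must have the same length")

    tp = fp = tn = fn = 0
    for flagged, truth in zip(system_flags, clinician_labels):
        if flagged and truth:
            tp += 1
        elif flagged and not truth:
            fp += 1
        elif not flagged and truth:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Return NaN rather than raising when a measure is undefined.
        return numerator / denominator if denominator else float("nan")

    return ScreeningMetrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
    )


if __name__ == "__main__":
    # Invented toy data, not results from the paper.
    flags = [True, True, False, False, False, False, False, True]
    truth = [True, True, True, False, False, False, False, False]
    print(evaluate_screening(flags, truth))
```

Sensitivity and specificity do not depend on how common the disease is in the evaluated population, whereas positive predictive value does, which is one reason the three measures are usually reported together.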

Publication types

  • Clinical Trial
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Database Management Systems
  • Diagnosis, Computer-Assisted / methods*
  • Expert Systems
  • Feasibility Studies
  • Humans
  • Infant, Newborn
  • Information Storage and Retrieval / methods
  • Mass Screening / methods*
  • Medical Records Systems, Computerized*
  • Natural Language Processing*
  • Pneumonia / diagnosis*
  • Pneumonia / epidemiology*
  • Radiology Information Systems*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • United States / epidemiology