Natural Language Processing to identify pneumonia from radiology reports

Pharmacoepidemiol Drug Saf. 2013 Aug;22(8):834-41. doi: 10.1002/pds.3418. Epub 2013 Apr 1.


Purpose: This study aimed to develop Natural Language Processing (NLP) approaches to supplement manual outcome validation, specifically to validate pneumonia cases from chest radiograph reports.

Methods: We trained one NLP system, ONYX, using previously manually reviewed radiograph reports from children and adults. We then assessed its validity on a test set of 5000 reports. Because we aimed to substantially decrease manual review rather than replace it entirely, we classified each report as (1) consistent with pneumonia; (2) inconsistent with pneumonia; or (3) requiring manual review because of complex features. We developed processes tailored either to optimize accuracy or to minimize manual review. Using logistic regression, we jointly modeled the sensitivity and specificity of ONYX in relation to patient age, comorbidity, and care setting. We estimated positive and negative predictive values (PPV and NPV) assuming the pneumonia prevalence in the source data.
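The three-way classification described above implies a two-part evaluation: the share of reports deferred to manual review, and sensitivity and specificity among the reports that were classified automatically. The following is a minimal sketch of that bookkeeping, not the study's actual code. The label strings, the `evaluate_triage` function, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical label names for the three output categories.
PNEUMONIA = "pneumonia"
NO_PNEUMONIA = "no_pneumonia"
MANUAL_REVIEW = "manual_review"


def evaluate_triage(pairs):
    """Summarize a three-way classifier against a manual reference standard.

    pairs: iterable of (truth, label), where truth is True for a manually
    confirmed pneumonia and label is one of the three categories above.
    Sensitivity and specificity are computed only over reports that were
    NOT deferred to manual review.
    """
    counts = Counter(pairs)
    total = sum(counts.values())
    flagged = counts[(True, MANUAL_REVIEW)] + counts[(False, MANUAL_REVIEW)]
    tp = counts[(True, PNEUMONIA)]
    fn = counts[(True, NO_PNEUMONIA)]
    tn = counts[(False, NO_PNEUMONIA)]
    fp = counts[(False, PNEUMONIA)]
    return {
        "manual_review_fraction": flagged / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy data (invented, not from the study): 20 reports in total.
toy = (
    [(True, PNEUMONIA)] * 3
    + [(True, NO_PNEUMONIA)] * 1
    + [(True, MANUAL_REVIEW)] * 2
    + [(False, NO_PNEUMONIA)] * 9
    + [(False, PNEUMONIA)] * 1
    + [(False, MANUAL_REVIEW)] * 4
)
result = evaluate_triage(toy)
print(result)
```

Excluding the deferred reports from the denominators mirrors how the abstract reports accuracy "for the remainder"; a stricter analysis might instead count deferred reports separately for each true class.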

Results: When tailored for accuracy, ONYX identified 25% of reports as requiring manual review (34% of true pneumonias and 18% of non-pneumonias). For the remainder, ONYX's sensitivity was 92% (95% CI 90-93%), specificity 87% (86-88%), PPV 74% (72-76%), and NPV 96% (96-97%). When tailored to minimize manual review, ONYX classified 12% of reports as needing manual review. For the remainder, ONYX had sensitivity 75% (72-77%), specificity 95% (94-96%), PPV 86% (83-88%), and NPV 91% (90-91%).
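PPV and NPV depend on prevalence as well as on sensitivity and specificity, which is why the Methods state a prevalence assumption. As a rough illustration, the standard Bayes' theorem relationship can be sketched as below. The sensitivity and specificity are the accuracy-tailored point estimates reported above. The prevalence of 0.30 is a hypothetical placeholder, since the abstract does not state the source-data prevalence, and the function name is likewise an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence
    using Bayes' theorem. All inputs are proportions in [0, 1]."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity 0.92 and specificity 0.87 are taken from the abstract.
# The prevalence of 0.30 is HYPOTHETICAL, not a reported value.
ppv, npv = predictive_values(sensitivity=0.92, specificity=0.87, prevalence=0.30)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Re-running the function with a lower prevalence shows PPV falling and NPV rising, so predictive values estimated in one population may not transfer to a setting where pneumonia is rarer or more common.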

Conclusions: For pneumonia validation, ONYX can replace almost 90% of manual review while maintaining low to moderate misclassification rates. It can be tailored for different outcomes and study needs and thus warrants exploration in other settings.

Keywords: Natural Language Processing; pharmacoepidemiology; pneumonia; sensitivity; specificity; validity.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Adolescent
  • Adult
  • Age Factors
  • Aged
  • Aged, 80 and over
  • Child
  • Child, Preschool
  • Humans
  • Infant
  • Logistic Models
  • Middle Aged
  • Natural Language Processing*
  • Pharmacoepidemiology*
  • Pneumonia / diagnosis*
  • Pneumonia / diagnostic imaging
  • Pneumonia / epidemiology
  • Predictive Value of Tests
  • Prevalence
  • Radiography
  • Young Adult