A natural language processing system to extract and code concepts relating to congestive heart failure from chest radiology reports

AMIA Annu Symp Proc. 2006;2006:269-73.


We have developed a natural language processing system for extracting and coding clinical data from free text reports. The system is designed to be easily modified and adapted to a variety of free text clinical reports such as admission notes, radiology and pathology reports, and discharge summaries. This report presents the results of applying the system to extract and code clinical concepts related to congestive heart failure from 39,000 chest radiology reports. The system detects the presence or absence of six concepts: congestive heart failure, Kerley B lines, cardiomegaly, prominent pulmonary vasculature, pulmonary edema, and pleural effusion. We compared its output to a gold standard consisting of specially trained human coders and an experienced physician. Results indicate that the system had high specificity, recall, and precision for each of the concepts it is designed to detect.
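The evaluation described above reports specificity, recall, and precision per concept. As a minimal sketch of how such figures can be computed from presence/absence labels, the following uses the standard textbook definitions; it is not the authors' evaluation code, and the names `Counts`, `tally`, and `metrics`, as well as the sample labels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Counts:
    """Confusion-matrix counts for one concept (e.g. pleural effusion)."""
    tp: int = 0  # system: present, gold standard: present
    fp: int = 0  # system: present, gold standard: absent
    tn: int = 0  # system: absent,  gold standard: absent
    fn: int = 0  # system: absent,  gold standard: present


def tally(system: Sequence[bool], gold: Sequence[bool]) -> Counts:
    """Compare per-report system labels against gold-standard labels."""
    if len(system) != len(gold):
        raise ValueError("system and gold must label the same reports")
    c = Counts()
    for s, g in zip(system, gold):
        if s and g:
            c.tp += 1
        elif s and not g:
            c.fp += 1
        elif not s and g:
            c.fn += 1
        else:
            c.tn += 1
    return c


def _ratio(num: int, den: int) -> Optional[float]:
    # Return None when the metric is undefined (empty denominator).
    return num / den if den else None


def metrics(c: Counts) -> dict:
    """Standard definitions of the three reported measures."""
    return {
        "recall": _ratio(c.tp, c.tp + c.fn),       # a.k.a. sensitivity
        "precision": _ratio(c.tp, c.tp + c.fp),    # positive predictive value
        "specificity": _ratio(c.tn, c.tn + c.fp),
    }


# Hypothetical labels for five reports and a single concept (not study data).
system_labels = [True, True, False, False, True]
gold_labels = [True, False, False, False, True]
result = metrics(tally(system_labels, gold_labels))
```

In a six-concept setting such as the one described, the same tally would be run once per concept, so that each concept gets its own set of three figures.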

Publication types

  • Comparative Study
  • Validation Study

MeSH terms

  • Forms and Records Control*
  • Heart Failure / classification
  • Heart Failure / diagnostic imaging*
  • Humans
  • Natural Language Processing*
  • Predictive Value of Tests
  • Radiography, Thoracic / classification*
  • Sensitivity and Specificity