Objective: To evaluate the performance of a natural language processing system in extracting pneumonia-related concepts from chest x-ray reports.
Design: Four physicians, three lay persons, a natural language processing system, and two keyword searches (designated AAKS and KS) detected the presence or absence of three pneumonia-related concepts and inferred the presence or absence of acute bacterial pneumonia from 292 chest x-ray reports. Gold standard: Majority vote of three independent physicians. Reliability of the gold standard was measured.
Outcome measures: Recall, precision, specificity, and agreement (using Finn's R statistic) with respect to the gold standard. Differences between the physicians and the other subjects were tested using the McNemar test for each pneumonia concept and for the disease inference of acute bacterial pneumonia.
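The outcome measures above can be illustrated with a minimal sketch. It assumes each subject's output for one concept is a list of binary labels (1 = present, 0 = absent) aligned report by report with the gold standard. The function names and the toy labels are hypothetical and are not taken from the study. The McNemar comparison is shown in its exact binomial form and pairs subjects on correctness relative to the gold standard; the abstract does not say which variant of the test or which pairing was used, so both are assumptions. Finn's R is omitted because the abstract does not give its formula.

```python
from math import comb


def confusion_counts(gold, predicted):
    """Count true/false positives and negatives against the gold standard."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    tn = sum(1 for g, p in pairs if not g and not p)
    return tp, fp, fn, tn


def recall_precision_specificity(gold, predicted):
    """Recall = TP/(TP+FN), precision = TP/(TP+FP), specificity = TN/(TN+FP)."""
    tp, fp, fn, tn = confusion_counts(gold, predicted)
    return tp / (tp + fn), tp / (tp + fp), tn / (tn + fp)


def mcnemar_exact(gold, subject_a, subject_b):
    """Exact two-sided McNemar test on the discordant pairs.

    b counts reports where A agrees with the gold standard and B does not;
    c counts the reverse. Under the null hypothesis, b follows a
    Binomial(b + c, 0.5) distribution.
    """
    triples = list(zip(gold, subject_a, subject_b))
    b = sum(1 for g, x, y in triples if x == g and y != g)
    c = sum(1 for g, x, y in triples if x != g and y == g)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, so no evidence of a difference
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy labels for ten hypothetical reports (made up for illustration only).
gold = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
physician = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
system = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

recall, precision, specificity = recall_precision_specificity(gold, system)
b, c, p_value = mcnemar_exact(gold, physician, system)
print(recall, precision, specificity)
print(b, c, p_value)
```

With only a handful of discordant pairs, as in this toy data, the exact form is the safer choice; the chi-square approximation (b - c)^2 / (b + c) is normally reserved for larger discordant counts.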
Results: Reliability of the gold standard ranged from 0.86 to 0.96. Recall, precision, specificity, and agreement (Finn's R) for the inference on acute bacterial pneumonia were, respectively, 0.94, 0.87, 0.91, and 0.84 for physicians; 0.95, 0.78, 0.85, and 0.75 for the natural language processing system; 0.46, 0.89, 0.95, and 0.54 for lay persons; 0.79, 0.63, 0.71, and 0.49 for AAKS; and 0.87, 0.70, 0.77, and 0.62 for KS. The McNemar pairwise comparisons showed differences between one physician and the natural language processing system for the infiltrate concept and between another physician and the natural language processing system for the inference on acute bacterial pneumonia. The comparisons also showed that most physicians differed significantly from the other subjects on all pneumonia concepts and on the disease inference.
Conclusion: In extracting pneumonia-related concepts from chest x-ray reports, the performance of the natural language processing system was similar to that of physicians and better than that of lay persons and keyword searches. The encoded pneumonia information has the potential to support several pneumonia-related applications used in our institution, including a decision support system called the antibiotic assistant, a computerized clinical protocol for pneumonia, and a quality assurance application in the radiology department.