Assessment of the Accuracy of Using ICD-9 Diagnosis Codes to Identify Pneumonia Etiology in Patients Hospitalized With Pneumonia

JAMA Netw Open. 2020 Jul 1;3(7):e207750. doi: 10.1001/jamanetworkopen.2020.7750.


Importance: Administrative databases may offer efficient clinical data collection for studying epidemiology, outcomes, and temporal trends in health care delivery. However, such data have seldom been validated against microbiological laboratory results.

Objective: To assess the validity of International Classification of Diseases, Ninth Revision (ICD-9) organism-specific administrative codes for pneumonia using microbiological data (test results for blood or respiratory culture, urinary antigen, or polymerase chain reaction) as the criterion standard.

Design, setting, and participants: Cross-sectional diagnostic accuracy study conducted between February 2017 and June 2019 using data from 178 US hospitals in the Premier Healthcare Database. Patients were adults aged 18 years or older who were admitted with pneumonia and discharged between July 1, 2010, and June 30, 2015. Data were analyzed from February 14, 2017, to June 27, 2019.

Exposures: Organism-specific pneumonia identified from ICD-9 codes.

Main outcomes and measures: Sensitivity, specificity, positive predictive value, and negative predictive value of ICD-9 codes using microbiological data as the criterion standard.
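
As a minimal sketch of how these four measures follow from a 2x2 table that cross-classifies ICD-9 code status against the laboratory result, the Python below uses hypothetical counts that are not taken from the study. The function name `diagnostic_accuracy` and its parameter names are illustrative assumptions, not the authors' code.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Compute accuracy measures of a code against a criterion standard.

    tp: code present, laboratory result positive
    fp: code present, laboratory result negative
    fn: code absent,  laboratory result positive
    tn: code absent,  laboratory result negative
    """
    return {
        # Share of laboratory-positive patients who carry the code
        "sensitivity": tp / (tp + fn),
        # Share of laboratory-negative patients who lack the code
        "specificity": tn / (tn + fp),
        # Share of coded patients who are laboratory positive
        "ppv": tp / (tp + fp),
        # Share of uncoded patients who are laboratory negative
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for illustration only (not study data)
metrics = diagnostic_accuracy(tp=20, fp=5, fn=80, tn=895)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the code is highly specific but misses most laboratory-confirmed cases, which is the general pattern the abstract reports.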

Results: Of 161 529 patients meeting inclusion criteria (mean [SD] age, 69.5 [16.2] years; 51.2% women), 35 759 (22.1%) had an identified pathogen. ICD-9-coded organisms and laboratory findings differed notably: for example, ICD-9 codes identified only 14.2% and 17.3% of patients with laboratory-detected methicillin-sensitive Staphylococcus aureus and Escherichia coli, respectively. Although specificities and negative predictive values exceeded 95% for all codes, sensitivities ranged downward from 95.9% (95% CI, 95.3%-96.5%) for influenza virus to 14.0% (95% CI, 8.8%-20.8%) for parainfluenza virus, and positive predictive values ranged downward from 91.1% (95% CI, 89.5%-92.6%) for Staphylococcus aureus to 57.1% (95% CI, 39.4%-73.7%) for parainfluenza virus.

Conclusions and relevance: In this study, ICD-9 codes did not reliably capture pneumonia etiology identified by laboratory testing; because of the high specificities of ICD-9 codes, however, administrative data may be useful in identifying risk factors for resistant organisms. The low sensitivities of the diagnosis codes may limit the validity of organism-specific pneumonia prevalence estimates derived from administrative data.
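
To illustrate why low sensitivity can distort prevalence estimates drawn from codes, the sketch below computes the prevalence a code would appear to show for a given true prevalence, sensitivity, and specificity, and then inverts that relationship with the standard Rogan-Gladen estimator. This is not an analysis from the study: the function names are hypothetical and the input values are assumed for illustration only.

```python
def apparent_prevalence(true_prev, sensitivity, specificity):
    """Prevalence observed through an imperfect code:
    true positives plus false positives, as a share of all patients."""
    return sensitivity * true_prev + (1 - specificity) * (1 - true_prev)


def rogan_gladen(apparent_prev, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from apparent prevalence.
    Requires sensitivity + specificity > 1."""
    return (apparent_prev + specificity - 1) / (sensitivity + specificity - 1)


# Assumed values for illustration only (not study data)
true_prev = 0.05
sens, spec = 0.14, 0.99

observed = apparent_prevalence(true_prev, sens, spec)
recovered = rogan_gladen(observed, sens, spec)
print(f"apparent prevalence: {observed:.4f}")
print(f"corrected prevalence: {recovered:.4f}")
```

Under these assumptions the coded prevalence is roughly one-third of the true value, and the correction recovers the true value only because the sensitivity and specificity are taken as known.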

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Aged
  • Cross-Sectional Studies
  • Databases, Factual / statistics & numerical data
  • Female
  • Hospitalization / statistics & numerical data*
  • Humans
  • Inpatients / statistics & numerical data
  • International Classification of Diseases / standards*
  • Male
  • Microbiological Techniques* / methods
  • Microbiological Techniques* / standards
  • Middle Aged
  • Pneumonia* / epidemiology
  • Pneumonia* / etiology
  • Pneumonia* / microbiology
  • Pneumonia* / therapy
  • Predictive Value of Tests
  • Sensitivity and Specificity
  • United States / epidemiology