Accuracy of phenotyping chronic rhinosinusitis in the electronic health record

Joy Hsu; Jennifer A Pacheco; Whitney W Stevens; Maureen E Smith; Pedro C Avila

doi:10.2500/ajra.2014.28.4012

Am J Rhinol Allergy. 2014 Mar-Apr;28(2):140-4. doi: 10.2500/ajra.2014.28.4012.

Authors

Joy Hsu¹, Jennifer A Pacheco, Whitney W Stevens, Maureen E Smith, Pedro C Avila

Affiliation

¹ Division of Allergy-Immunology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA.

Abstract

Background: Chronic rhinosinusitis (CRS) is prevalent, morbid, and poorly understood. Extraction of electronic health record (EHR) data of patients with CRS may facilitate research on CRS. However, the accuracy of using structured billing codes for EHR-driven phenotyping of CRS is unknown. We sought to accurately identify CRS cases and controls using EHR data and to determine the accuracy of structured billing codes for identifying patients with CRS.

Methods: We developed and validated distinct algorithms to identify patients with CRS and controls using International Classification of Diseases, Ninth Revision (ICD-9) and Current Procedural Terminology codes. We used blinded clinician chart review as the reference standard to evaluate algorithm and billing code accuracy.
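Scoring an algorithm against chart review can be illustrated with a short sketch. The results are reported as positive predictive value (PPV): of the patients an algorithm flags, the fraction that the reference standard confirms. The function name, argument names, and toy values below are assumptions for illustration only, not the study's code or data.

```python
from typing import Sequence


def positive_predictive_value(
    algorithm_flags: Sequence[bool],
    chart_review_confirms: Sequence[bool],
) -> float:
    """Fraction of algorithm-flagged patients confirmed by chart review.

    Both sequences are hypothetical inputs, one entry per patient, in the
    same order. chart_review_confirms[i] is True when the clinician review
    agrees with the label the algorithm is trying to assign.
    """
    if len(algorithm_flags) != len(chart_review_confirms):
        raise ValueError("inputs must have one entry per patient")

    # Keep only the reference labels of patients the algorithm flagged.
    confirmed_among_flagged = [
        confirmed
        for flagged, confirmed in zip(algorithm_flags, chart_review_confirms)
        if flagged
    ]
    if not confirmed_among_flagged:
        raise ValueError("algorithm flagged no patients; PPV is undefined")

    return sum(confirmed_among_flagged) / len(confirmed_among_flagged)


# Toy example (made-up values): 4 flagged patients, 3 confirmed on review.
toy_flags = [True, True, False, True, True]
toy_review = [True, False, False, True, True]
toy_ppv = positive_predictive_value(toy_flags, toy_review)
```

The same function applies to a control algorithm if the reference label is "confirmed not to have CRS": a control PPV computed this way is the negative predictive value for CRS.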

Results: Our initial control algorithm achieved a control positive predictive value (PPV) of 100% (i.e., negative predictive value of 100% for CRS). Our initial algorithm for CRS cases relied exclusively on billing codes and had a low case PPV (54%). Notably, ICD-9 code 471.x was associated with a case PPV of 85%, whereas the case PPV of ICD-9 code 473.x was only 34%. After multiple algorithm iterations, we increased the case PPV of our final algorithm to 91% by adding several requirements, e.g., that ICD-9 codes occur with 1 or more evaluations by a CRS specialist to enhance availability of objective clinical data for accurately phenotyping CRS.
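The one refinement named above, requiring a CRS billing code together with at least 1 CRS specialist evaluation, can be sketched as a simple rule. This is a minimal illustration, not the published algorithm: the record layout, field names, and function names are hypothetical, ICD-9 codes are assumed to be stored as dotted strings such as "473.9", and the other requirements of the final algorithm (including any Current Procedural Terminology criteria) are not described in the abstract and are therefore not modeled.

```python
from dataclasses import dataclass, field
from typing import List

# ICD-9 code families named in the abstract (471.x and 473.x).
CRS_ICD9_FAMILIES = {"471", "473"}


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary; not the study's actual schema."""
    icd9_codes: List[str] = field(default_factory=list)
    crs_specialist_evaluations: int = 0


def has_crs_billing_code(record: PatientRecord) -> bool:
    # Compare the part before the decimal point, so "471.0" matches "471".
    return any(code.split(".")[0] in CRS_ICD9_FAMILIES
               for code in record.icd9_codes)


def meets_initial_rule(record: PatientRecord) -> bool:
    # Billing codes alone, as in an approach relying exclusively on codes.
    return has_crs_billing_code(record)


def meets_refined_rule(record: PatientRecord,
                       min_evaluations: int = 1) -> bool:
    # Adds the specialist-evaluation requirement described in the abstract.
    return (has_crs_billing_code(record)
            and record.crs_specialist_evaluations >= min_evaluations)


# Made-up records for illustration only.
coded_only = PatientRecord(icd9_codes=["473.9"])
coded_and_seen = PatientRecord(icd9_codes=["471.0"],
                               crs_specialist_evaluations=2)
unrelated = PatientRecord(icd9_codes=["477.9"],
                          crs_specialist_evaluations=1)
```

Under this sketch, a patient with only a 473.x code passes the code-only rule but not the refined one, which mirrors the idea of trading some candidates for cases with more objective clinical data available.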

Conclusion: These algorithms are an important first step to identify patients with CRS, and may facilitate EHR-based research on CRS pathogenesis, morbidity, and management. Exclusive use of coded data for phenotyping CRS has limited accuracy, especially because CRS symptomatology overlaps with that of other illnesses. Incorporating natural language processing (e.g., to evaluate results of nasal endoscopy or sinus computed tomography) into future work may increase algorithm accuracy and identify patients whose disease status may not be ascertained by only using billing codes.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Chronic Disease
Electronic Health Records*
Endoscopy
Humans
International Classification of Diseases
Natural Language Processing
Observer Variation
Phenotype
Predictive Value of Tests
Reference Standards
Reproducibility of Results
Rhinitis / classification
Rhinitis / diagnosis*
Rhinitis / economics
Sinusitis / classification
Sinusitis / diagnosis*
Sinusitis / economics
Tomography, X-Ray Computed
