Performance characteristics of code-based algorithms to identify urinary tract infections in large United States administrative claims databases

Pharmacoepidemiol Drug Saf. 2022 Sep;31(9):953-962. doi: 10.1002/pds.5492. Epub 2022 Jul 4.

Abstract

Background: In real-world evidence research, reliability of coding in healthcare databases dictates the accuracy of code-based algorithms in identifying conditions such as urinary tract infection (UTI). This study evaluates the performance characteristics of code-based algorithms to identify UTI.

Methods: This was a retrospective observational study of adults with records in three large U.S. administrative claims databases on or after January 1, 2010. A targeted literature review was performed to inform the development of 10 code-based algorithms to identify UTIs. The algorithms consisted of combinations of diagnosis codes, antibiotic exposure for the treatment of UTIs, and/or ordering of a urinalysis or urine culture. For each database, a probabilistic gold standard was developed using PheValuator. The performance characteristics of each code-based algorithm were assessed against the probabilistic gold standard.
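As an illustration of how performance characteristics can be computed against a probabilistic gold standard, the following minimal Python sketch treats each patient's model-estimated probability of UTI as a fractional contribution to the confusion matrix. It is a sketch under stated assumptions, not the study's implementation: the function name `probabilistic_performance` and its inputs are hypothetical, and the actual PheValuator workflow is not reproduced here.

```python
def probabilistic_performance(flags, probabilities):
    """Estimate performance of a code-based algorithm against a
    probabilistic gold standard.

    flags         -- iterable of bool; True if the algorithm flagged the patient
    probabilities -- iterable of float in [0, 1]; assumed model-estimated
                     probability that the patient truly had the condition
    """
    tp = fp = fn = tn = 0.0
    for flagged, p in zip(flags, probabilities):
        if flagged:
            tp += p        # expected true positives
            fp += 1.0 - p  # expected false positives
        else:
            fn += p        # expected false negatives
            tn += 1.0 - p  # expected true negatives

    def ratio(num, den):
        # Return None rather than dividing by zero when a cell total is empty.
        return num / den if den > 0 else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data: four patients, two flagged by the algorithm.
example = probabilistic_performance(
    flags=[True, True, False, False],
    probabilities=[0.9, 0.7, 0.2, 0.2],
)
```

In this toy example the flagged patients contribute 1.6 expected true positives and 0.4 expected false positives, so both PPV and sensitivity come out near 0.8.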

Results: A total of 2 950 641, 1 831 405, and 2 294 929 patients meeting study criteria were identified in the three databases, respectively. Overall, the code-based algorithm requiring a primary UTI diagnosis code achieved the highest positive predictive values (PPV; >93.8%) but the lowest sensitivities (<12.9%). Algorithms requiring three UTI diagnosis codes achieved similar PPVs (>89.9%) and improved sensitivities (<41.6%). Algorithms requiring a single UTI diagnosis code in any position achieved the highest sensitivities (>72.1%) alongside a slight reduction in PPVs (<78.3%). All-time prevalence estimates of UTI ranged from 21.6% to 48.6%.
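The three diagnosis-code rules compared above can be sketched as simple predicates over a patient's claims. This is an illustrative sketch only: the claim structure, the function names, and the one-item code list `UTI_CODES` are hypothetical (the study's actual code lists are not given here), and counting one UTI code per claim toward the three-code rule is an assumption.

```python
# Hypothetical, illustrative code list; N39.0 is the ICD-10-CM code for
# "urinary tract infection, site not specified". Not the study's list.
UTI_CODES = {"N39.0"}


def uti_claims(claims):
    """Return the claims carrying a UTI code in any diagnosis position."""
    return [c for c in claims if any(d in UTI_CODES for d in c["diagnoses"])]


def single_code_any_position(claims):
    """At least one claim with a UTI code in any position."""
    return len(uti_claims(claims)) >= 1


def primary_code(claims):
    """At least one claim whose first-listed (primary) diagnosis is a UTI code."""
    return any(c["diagnoses"] and c["diagnoses"][0] in UTI_CODES for c in claims)


def three_codes(claims):
    """At least three claims with a UTI code (assumed: one count per claim)."""
    return len(uti_claims(claims)) >= 3


# Hypothetical patient with two claims; diagnoses are listed in position order.
patient = [
    {"diagnoses": ["I10", "N39.0"]},  # UTI code in a secondary position
    {"diagnoses": ["N39.0"]},         # UTI code in the primary position
]
results = (
    single_code_any_position(patient),
    primary_code(patient),
    three_codes(patient),
)
```

Stricter rules flag fewer patients, which is consistent with the trade-off reported above: higher PPV at the cost of lower sensitivity.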

Conclusions: Based on these findings, we recommend use of algorithms requiring a single UTI diagnosis code, which achieved high sensitivity and PPV. In studies where PPV is critical, we recommend code-based algorithms requiring three UTI diagnosis codes rather than a single primary UTI diagnosis code.

Keywords: PheValuator; administrative claims databases; code-based algorithms; observational research; performance characteristics; phenotype; urinary tract infections.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Algorithms
  • Databases, Factual
  • Humans
  • Observational Studies as Topic
  • Reproducibility of Results
  • United States / epidemiology
  • Urinalysis
  • Urinary Tract Infections* / diagnosis
  • Urinary Tract Infections* / drug therapy
  • Urinary Tract Infections* / epidemiology