Detecting chronic kidney disease in population-based administrative databases using an algorithm of hospital encounter and physician claim codes

BMC Nephrol. 2013 Apr 5;14:81. doi: 10.1186/1471-2369-14-81.


Background: Large, population-based administrative healthcare databases can be used to identify patients with chronic kidney disease (CKD) when serum creatinine laboratory results are unavailable. We examined the validity of algorithms that used combined hospital encounter and physician claims database codes for the detection of CKD in Ontario, Canada.

Methods: We accrued 123,499 patients over the age of 65 from 2007 to 2010. All patients had a baseline serum creatinine value, which was used to estimate glomerular filtration rate (eGFR). We developed an algorithm of physician claims and hospital encounter codes to search administrative databases for the presence of CKD. We determined the sensitivity, specificity, and positive and negative predictive values of this algorithm for detecting CKD at our primary threshold, an eGFR <45 mL/min per 1.73 m² (15.4% of patients). We also assessed serum creatinine and eGFR values in patients with and without CKD codes (algorithm positive and negative, respectively).
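The validation step described above can be illustrated with a minimal sketch: each patient is flagged as algorithm positive if any CKD code is present, and that flag is compared against the eGFR reference standard. The code set, the `Patient` record, and the toy cohort below are hypothetical placeholders, since the abstract does not list the eleven codes or any patient-level data.

```python
from dataclasses import dataclass

EGFR_THRESHOLD = 45.0  # mL/min per 1.73 m^2, the primary CKD threshold in the abstract

# Hypothetical placeholders: the abstract does not list the actual CKD codes.
CKD_CODES = frozenset({"CODE_A", "CODE_B"})


@dataclass
class Patient:
    egfr: float        # baseline eGFR (reference standard)
    codes: frozenset   # codes found in claims / hospital records


def algorithm_positive(patient):
    """Algorithm positive means at least one CKD code is present."""
    return bool(patient.codes & CKD_CODES)


def _ratio(numerator, denominator):
    """Return a proportion, or NaN when the denominator is empty."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(patients):
    """Cross-tabulate algorithm status against eGFR < threshold."""
    tp = fp = fn = tn = 0
    for p in patients:
        has_ckd = p.egfr < EGFR_THRESHOLD
        flagged = algorithm_positive(p)
        if has_ckd and flagged:
            tp += 1
        elif has_ckd:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy cohort for illustration only; these are not study data.
toy_cohort = [
    Patient(30.0, frozenset({"CODE_A"})),   # CKD, flagged     -> true positive
    Patient(40.0, frozenset()),             # CKD, not flagged -> false negative
    Patient(70.0, frozenset({"CODE_B"})),   # no CKD, flagged  -> false positive
    Patient(80.0, frozenset()),             # no CKD, unflagged -> true negative
    Patient(90.0, frozenset({"OTHER"})),    # unrelated code   -> true negative
]

metrics = diagnostic_metrics(toy_cohort)
print(metrics)
```

In this sketch a patient with only unrelated codes counts as algorithm negative, which mirrors the "at least one CKD code" rule rather than any weighting of codes.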

Results: Our algorithm required evidence of at least one of eleven CKD codes, and 7.7% of patients were algorithm positive. The sensitivity was 32.7% [95% confidence interval (95% CI): 32.0 to 33.3%]. Sensitivity was lower in women than in men (25.7 vs. 43.7%; p < 0.001) and in the oldest age category (over 80 vs. 66 to 80; 28.4 vs. 37.6%; p < 0.001). All specificities were over 94%. The positive and negative predictive values were 65.4% (95% CI: 64.4 to 66.3%) and 88.8% (95% CI: 88.6 to 89.0%), respectively. In algorithm positive patients, the median [interquartile range (IQR)] baseline serum creatinine value was 135 μmol/L (106 to 179 μmol/L) compared to 82 μmol/L (69 to 98 μmol/L) for algorithm negative patients. Corresponding eGFR values were 38 mL/min per 1.73 m² (26 to 51 mL/min per 1.73 m²) vs. 69 mL/min per 1.73 m² (56 to 82 mL/min per 1.73 m²), respectively.
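As a rough consistency check, the reported predictive values can be approximately recovered from three other figures in the abstract: the CKD prevalence (15.4%), the algorithm-positive fraction (7.7%), and the sensitivity (32.7%). The sketch below is illustrative arithmetic on rounded percentages, not the authors' analysis; the function name and the derived specificity are assumptions of this example.

```python
def derive_from_reported(prevalence, algorithm_pos, sensitivity):
    """Rebuild the 2x2 table as population fractions, then derive PPV, NPV, specificity."""
    tp = sensitivity * prevalence   # CKD and algorithm positive
    fp = algorithm_pos - tp         # no CKD but algorithm positive
    tn = (1.0 - prevalence) - fp    # no CKD and algorithm negative
    return {
        "ppv": tp / algorithm_pos,
        "npv": tn / (1.0 - algorithm_pos),
        "specificity": tn / (1.0 - prevalence),
    }


# Inputs are the rounded values reported in the abstract.
derived = derive_from_reported(prevalence=0.154, algorithm_pos=0.077, sensitivity=0.327)
for name, value in derived.items():
    print(f"{name}: {value:.3f}")
```

Under these rounded inputs the derived PPV and NPV land close to the reported 65.4% and 88.8%, and the derived overall specificity is consistent with the statement that all specificities exceeded 94%. The same inputs show that the algorithm-positive fraction is about half the true prevalence at this eGFR threshold.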

Conclusions: Patients with CKD as identified by our database algorithm had distinctly higher baseline serum creatinine values and lower eGFR values than those without such codes. However, because of the algorithm's limited sensitivity, the prevalence of CKD was underestimated.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Aged
  • Aged, 80 and over
  • Algorithms*
  • Cohort Studies
  • Databases, Factual / standards
  • Female
  • Hospital Administration / standards*
  • Hospitalization
  • Humans
  • International Classification of Diseases / standards*
  • Male
  • Ontario / epidemiology
  • Physicians / standards*
  • Population Surveillance* / methods
  • Renal Insufficiency, Chronic / diagnosis*
  • Renal Insufficiency, Chronic / epidemiology
  • Renal Insufficiency, Chronic / therapy
  • Retrospective Studies