Objective: To create surveillance algorithms to detect diabetes and classify type 1 versus type 2 diabetes using structured electronic health record (EHR) data.
Research design and methods: We extracted 4 years of data from the EHR of a large, multisite, multispecialty ambulatory practice serving ∼700,000 patients. We flagged possible cases of diabetes using laboratory test results, diagnosis codes, and prescriptions. We assessed the sensitivity and positive predictive value of novel combinations of these data to classify type 1 versus type 2 diabetes among 210 individuals. We applied an optimized algorithm to a live, prospective, EHR-based surveillance system and reviewed 100 additional cases for validation.
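The flagging step can be sketched as follows. This is a minimal illustration, not the study's implementation: the `PatientSignals` record, its field names, and the rule that any single criterion is enough to flag a patient are all assumptions made for the example, and the specific laboratory thresholds, codes, and drugs are deliberately left out because the abstract does not state them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PatientSignals:
    """Hypothetical per-patient summary of structured EHR data."""
    has_diabetes_dx_code: bool   # any diabetes diagnosis code on record
    meets_lab_criteria: bool     # laboratory results suggestive of diabetes
    has_suggestive_rx: bool      # prescription suggestive of diabetes


def is_possible_diabetes(p: PatientSignals) -> bool:
    # Assumption: a patient is flagged when at least one criterion is met.
    return p.has_diabetes_dx_code or p.meets_lab_criteria or p.has_suggestive_rx


def criterion_shares(flagged: List[PatientSignals]) -> Dict[str, float]:
    """Fraction of flagged patients meeting each criterion.

    Criteria can overlap, so the fractions may sum to more than 1.
    """
    n = len(flagged)
    if n == 0:
        return {}
    return {
        "diagnosis_code": sum(p.has_diabetes_dx_code for p in flagged) / n,
        "laboratory": sum(p.meets_lab_criteria for p in flagged) / n,
        "prescription": sum(p.has_suggestive_rx for p in flagged) / n,
    }
```

Because a patient can satisfy several criteria at once, per-criterion shares computed this way are not expected to add up to 100%.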
Results: The diabetes algorithm flagged 43,177 patients. All criteria contributed unique cases: 78% had diabetes diagnosis codes, 66% fulfilled laboratory criteria, and 46% had suggestive prescriptions. For identifying type 1 diabetes, type 1 ICD-9 codes alone had a sensitivity of 26% (95% CI 12-49) and a positive predictive value of 94% (83-100); two or more type 1 codes plus any number of type 2 codes had a sensitivity of 90% (81-95) and a positive predictive value of 57% (33-86). An optimized algorithm incorporating the ratio of type 1 versus type 2 codes, plasma C-peptide and autoantibody levels, and suggestive prescriptions flagged 66 of 66 (100% [96-100]) patients with type 1 diabetes. On validation, the optimized algorithm correctly classified 35 of 36 patients with type 1 diabetes (raw sensitivity 97% [87-100]; population-weighted sensitivity 65% [36-100]; positive predictive value 88% [78-98]).
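For readers less familiar with these metrics, the sketch below shows how sensitivity and positive predictive value are computed from chart-review counts. The function names are illustrative, and the Wilson score interval is an assumption chosen for the example; the abstract does not say which confidence interval method the authors used, so the interval produced here need not match the reported ones.

```python
import math
from typing import Tuple


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of true cases that the algorithm flagged."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of flagged patients who are true cases."""
    return true_pos / (true_pos + false_pos)


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (assumed method, z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at p = 0 or 1.
    return max(0.0, center - half), min(1.0, center + half)


# Validation counts from the abstract: 35 of 36 type 1 patients correctly classified.
raw_sensitivity = sensitivity(true_pos=35, false_neg=1)
low, high = wilson_interval(35, 36)
```

With 35 of 36 patients correctly classified, `raw_sensitivity` is about 0.97, in line with the raw sensitivity reported. The population-weighted sensitivity depends on sampling weights that the abstract does not give, so it is not reproduced here.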
Conclusions: Algorithms applied to EHR data detect more cases of diabetes than claims codes and reasonably discriminate between type 1 and type 2 diabetes.