An algorithm for identification and classification of individuals with type 1 and type 2 diabetes mellitus in a large primary care database

Clin Epidemiol. 2016 Oct 12;8:373-380. doi: 10.2147/CLEP.S113415. eCollection 2016.


Background: Research into diabetes mellitus (DM) often requires a reproducible method for identifying and distinguishing individuals with type 1 DM (T1DM) and type 2 DM (T2DM).

Objectives: To develop a method to identify individuals with T1DM and T2DM using UK primary care electronic health records.

Methods: Using data from The Health Improvement Network primary care database, we developed a two-step algorithm. The first step identified individuals with potential T1DM or T2DM based on diagnostic records, treatment, and clinical test results. We excluded individuals who had records for rarer DM subtypes only. To be considered diabetic, individuals needed at least two records indicative of DM, one of which had to be a diagnostic record. The second step then classified individuals as having T1DM or T2DM, using a combination of diagnostic codes, medication prescribed, age at diagnosis, and whether the case was incident or prevalent. We internally validated this classification algorithm by comparing it against an independent clinical examination of The Health Improvement Network electronic health records for a random sample of 500 individuals with DM.
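The two-step logic described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the actual decision rules, so the record labels, the `AGE_CUTOFF` value, and the fallback rules in `classify` are hypothetical, and the incident/prevalent distinction is left out for brevity. Every record passed in is assumed to be indicative of DM.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    kind: str    # "diagnosis", "treatment", or "test"
    detail: str  # hypothetical labels: "T1DM", "T2DM", "rare_subtype", "insulin", ...


def is_potential_case(records: List[Record]) -> bool:
    """Step 1: at least two DM-indicative records, one of them diagnostic."""
    diagnoses = [r for r in records if r.kind == "diagnosis"]
    if not diagnoses:
        return False
    # Exclude individuals whose diagnostic records are for rarer subtypes only.
    if all(r.detail == "rare_subtype" for r in diagnoses):
        return False
    return len(records) >= 2


AGE_CUTOFF = 35  # hypothetical threshold, not taken from the paper


def classify(records: List[Record], age_at_diagnosis: int) -> str:
    """Step 2: assign a DM type using illustrative (assumed) rules."""
    if not is_potential_case(records):
        return "not_a_case"
    codes = {r.detail for r in records if r.kind == "diagnosis"} & {"T1DM", "T2DM"}
    drugs = {r.detail for r in records if r.kind == "treatment"}
    if codes == {"T1DM"} and "insulin" in drugs:
        return "T1DM"
    if codes == {"T2DM"}:
        return "T2DM"
    # Conflicting or unspecific codes: fall back on treatment and age.
    if drugs == {"insulin"} and age_at_diagnosis < AGE_CUTOFF:
        return "T1DM"
    if "oral_agent" in drugs or age_at_diagnosis >= AGE_CUTOFF:
        return "T2DM"
    return "unclassified"


if __name__ == "__main__":
    child = [Record("diagnosis", "T1DM"), Record("treatment", "insulin")]
    print(classify(child, age_at_diagnosis=12))
```

Keeping step 1 as a separate function mirrors the structure in the abstract, where case identification happens before, and independently of, type classification.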

Results: Out of 9,161,866 individuals aged 0-99 years from 2000 to 2014, we classified 37,693 individuals with T1DM and 418,433 with T2DM, while 1,792 individuals remained unclassified. A small proportion were classified with some uncertainty (1,155 [3.1%] of all individuals with T1DM and 6,139 [1.5%] with T2DM) due to unclear health records. During validation, manual assignment of DM type based on clinical assessment of the entire electronic record and algorithmic assignment led to equivalent classification in all instances.
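The percentages quoted in the Results follow directly from the reported counts. The short check below recomputes them; the helper name `pct` is only illustrative, and the count of all identified individuals is a derived sum rather than a figure stated in the abstract.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


t1dm, t2dm, unclassified = 37_693, 418_433, 1_792

# Shares classified with some uncertainty, as reported in the Results.
print(pct(1_155, t1dm))  # 3.1
print(pct(6_139, t2dm))  # 1.5

# Derived here, not reported: everyone identified with DM by the first step.
identified = t1dm + t2dm + unclassified
print(identified)  # 457918
```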

Conclusion: The majority of individuals with T1DM and T2DM can be readily identified from UK primary care electronic health records. Our approach can be adapted for use in other health care settings.

Keywords: algorithm; databases; diabetes and endocrinology; epidemiology; public health.