Introduction: This study aims to describe how people known to have diabetes can be identified through public data files, to validate this method, and to describe models for optimizing such identification.
Patients and methods: In a study population of 303,250 citizens, diabetics were identified by combining information from public data files with information from general practitioners. Data validity was checked by comparing the results of searches in public data files against information from general practitioners and a random sample of diabetics. Two models were defined to optimize the use of public data files for identifying diabetics. Model A identified the minimum number of parameters needed to obtain the highest possible sensitivity. Model B identified the combination of parameters that gave a high positive predictive value together with a high sensitivity.
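As a minimal sketch of the two quantities that models A and B optimize, the Python below computes sensitivity and positive predictive value from classification counts. The function names and the counts in the usage lines are hypothetical illustrations, not figures from this study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people with diabetes that the data search finds."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no people with the condition in the sample")
    return true_pos / total


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of people flagged by the data search who do have diabetes."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no people were flagged by the search")
    return true_pos / total


# Hypothetical counts, chosen only to show the calls.
tp, fp, fn = 90, 5, 10
print(sensitivity(tp, fn))
print(positive_predictive_value(tp, fp))
```

Adding a parameter to the search can only flag more people, so it tends to raise sensitivity while risking more false positives. Model B's focus on positive predictive value guards against that risk.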
Results: A total of 5449 diabetics were identified. Of these, 4438 (81%) were classified as Type 2 diabetics and 1011 (19%) as Type 1 diabetics. Data validation revealed that one person had been misclassified as a diabetic and 93 persons as non-diabetics. In model A, the identification parameters were "prescription", "HbA1c", "chiropodist service" and "glucose service". In model B, the optimal combination of parameters was: a minimum of two HbA1c measurements, a minimum of one visit to a chiropodist, a minimum of one prescription or a minimum of one abnormal HbA1c during one year.
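One possible reading of the model B rule can be sketched in Python. The abstract does not spell out how the four criteria are combined, so this sketch assumes that meeting any one of them within a one-year window flags a person. The record type `YearlyRecord`, its field names and the function `flagged_by_model_b` are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class YearlyRecord:
    """Hypothetical per-person counts from public data files for one year."""
    hba1c_measurements: int = 0
    chiropodist_visits: int = 0
    prescriptions: int = 0
    abnormal_hba1c: int = 0


def flagged_by_model_b(record: YearlyRecord) -> bool:
    # Assumption: the criteria are alternatives, so one is enough.
    return (
        record.hba1c_measurements >= 2
        or record.chiropodist_visits >= 1
        or record.prescriptions >= 1
        or record.abnormal_hba1c >= 1
    )


# A single normal HbA1c measurement does not meet any criterion.
print(flagged_by_model_b(YearlyRecord(hba1c_measurements=1)))  # False
# Two measurements in the same year do.
print(flagged_by_model_b(YearlyRecord(hba1c_measurements=2)))  # True
```

If the criteria were instead meant to be combined conjunctively, the `or` operators would change accordingly. The thresholds themselves follow the wording of the rule.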
Conclusion: Public data files are suitable for identification of both Type 1 and Type 2 diabetics. Models have been developed to identify diabetics and to support long-term follow-up and quality assessment in an unselected diabetic population in a region.