Analysis of identifier performance using a deterministic linkage algorithm
- PMID: 12463836
- PMCID: PMC2244404
Abstract
As part of developing a record linkage algorithm using de-identified patient data, we analyzed the performance of several demographic variables for making linkages between patient records from two hospital registries and the Social Security Death Master File. We analyzed samples from each registry totaling 6,000 record-pairs to establish a linkage gold standard. Using Social Security Number as the exclusive linkage variable resulted in substantial linkage error rates of 4.7% and 9.2%. The best single combination of variables for finding links was Social Security Number, phonetically compressed first name, birth month, and gender. This combination found 87% and 88% of the links without any false links. We achieved sensitivities of 90% to 92% while maintaining 100% specificity using combinations of Social Security Number, gender, name, and birth date fields. This approach is an accurate method for linking patient records to death data and is the basis for a more generalized de-identified linkage algorithm.
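To make the best-performing combination concrete, the following is a minimal Python sketch of a deterministic match on Social Security Number, phonetically compressed first name, birth month, and gender, together with the sensitivity and specificity calculation against a gold standard. It is illustrative only and is not the authors' implementation: the abstract does not name the phonetic compression scheme, so American Soundex is used here as a stand-in, and the `Record` type, the field names, `link_key`, `is_link`, and `sensitivity_specificity` are all hypothetical names introduced for this example.

```python
from typing import NamedTuple, Sequence, Tuple


class Record(NamedTuple):
    """Hypothetical record layout; field names are assumptions."""
    ssn: str
    first_name: str
    birth_month: int
    gender: str


# Soundex digit for each coded consonant (vowels, H, W, Y have no code).
_CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(name: str) -> str:
    """American Soundex, used as a stand-in phonetic compression."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with equal codes
        code = _CODES.get(c, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


def link_key(r: Record) -> Tuple[str, str, int, str]:
    """Build the comparison key: SSN digits, Soundex, month, gender."""
    ssn = "".join(ch for ch in r.ssn if ch.isdigit())
    return (ssn, soundex(r.first_name), r.birth_month, r.gender.strip().upper())


def is_link(a: Record, b: Record) -> bool:
    """Deterministic rule: every key component must agree exactly."""
    ka, kb = link_key(a), link_key(b)
    return bool(ka[0]) and ka == kb  # never link on a missing SSN


def sensitivity_specificity(
    predicted: Sequence[bool], gold: Sequence[bool]
) -> Tuple[float, float]:
    """Compare predicted links with gold-standard links, pair by pair."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)
    fn = sum(1 for p, g in pairs if not p and g)
    tn = sum(1 for p, g in pairs if not p and not g)
    fp = sum(1 for p, g in pairs if p and not g)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec
```

With synthetic records, `is_link(Record("123-45-6789", "Robert", 3, "M"), Record("123456789", "Rupert", 3, "m"))` is true because both first names compress to the same Soundex code, while changing the birth month on either side breaks the link. Requiring exact agreement on every component is what trades some sensitivity for the absence of false links described above.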
Similar articles
- Adding value to clinical data by linkage to a public death registry. Stud Health Technol Inform. 2001;84(Pt 2):1384-8. PMID: 11604954
- Analysis of a probabilistic record linkage technique without human review. AMIA Annu Symp Proc. 2003;2003:259-63. PMID: 14728174. Free PMC article.
- Validity of deterministic record linkage using multiple indirect personal identifiers: linking a large registry to claims data. Circ Cardiovasc Qual Outcomes. 2014 May;7(3):475-80. doi: 10.1161/CIRCOUTCOMES.113.000294. Epub 2014 Apr 22. PMID: 24755909
- [Linking of individual data. Methods of linkage]. Rev Epidemiol Sante Publique. 1997 Jun;45(3):248-56. PMID: 9280988. Review. French.
- Probabilistic record linkage and a method to calculate the positive predictive value. Int J Epidemiol. 2002 Dec;31(6):1246-52. doi: 10.1093/ije/31.6.1246. PMID: 12540730. Review.
Cited by
- Evaluation of real-world referential and probabilistic patient matching to advance patient identification strategy. J Am Med Inform Assoc. 2022 Jul 12;29(8):1409-1415. doi: 10.1093/jamia/ocac068. PMID: 35568993. Free PMC article.
- Identifying nonfatal firearm assault incidents through linking police data and clinical records: Cohort study in Indianapolis, Indiana, 2007-2016. Prev Med. 2021 Aug;149:106605. doi: 10.1016/j.ypmed.2021.106605. Epub 2021 May 13. PMID: 33992657. Free PMC article.
- Alliances to disseminate addiction prevention and treatment (ADAPT): A statewide learning health system to reduce substance use among justice-involved youth in rural communities. J Subst Abuse Treat. 2021 Sep;128:108368. doi: 10.1016/j.jsat.2021.108368. Epub 2021 Mar 16. PMID: 33867210. Free PMC article.
- Two-year prevalence rates of mental health and substance use disorder diagnoses among repeat arrestees. Health Justice. 2021 Jan 7;9(1):2. doi: 10.1186/s40352-020-00126-2. PMID: 33411067. Free PMC article.
- Universal Patient Identifier and Interoperability for Detection of Serious Drug Interactions: Retrospective Study. JMIR Med Inform. 2020 Nov 20;8(11):e23353. doi: 10.2196/23353. PMID: 33216009. Free PMC article.