Assessing record linkage between health care and Vital Statistics databases using deterministic methods

BMC Health Serv Res. 2006 Apr 5;6:48. doi: 10.1186/1472-6963-6-48.


Background: We assessed the linkage rate and the correct linkage rate of deterministic record linkage among three commonly used Canadian databases: the population registry, hospital discharge data, and the Vital Statistics registry.

Methods: Three combinations of four personal identifiers (surname, first name, sex and date of birth) were compared to determine the optimal combination. The correct linkage rate was assessed using a unique personal health number available in all three databases.
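As an illustration of the approach described in the Methods, the following is a minimal Python sketch, not the authors' implementation. It assumes that deterministic linkage means exact agreement on every identifier in a combination, that the linkage rate is the share of source records linked to exactly one target record, and that the correct linkage rate is the share of those links whose personal health numbers agree. The field names (`phn`, `surname`, `first_name`, `sex`, `dob`), the function names, and the toy records are all hypothetical.

```python
from collections import defaultdict


def normalize(value):
    """Trim whitespace and upper-case so 'Smith ' and 'SMITH' agree."""
    return str(value).strip().upper()


def link_records(source, target, keys):
    """Pair each source record with the target record that agrees exactly
    on every identifier in `keys`.

    Assumption: a source record counts as linked only when exactly one
    target record shares its key; ambiguous keys are left unlinked.
    """
    index = defaultdict(list)
    for rec in target:
        index[tuple(normalize(rec[k]) for k in keys)].append(rec)

    links = []
    for rec in source:
        matches = index.get(tuple(normalize(rec[k]) for k in keys), [])
        if len(matches) == 1:
            links.append((rec, matches[0]))
    return links


def linkage_rates(source, target, keys, id_field="phn"):
    """Return (linkage rate, correct linkage rate) under the assumed definitions."""
    links = link_records(source, target, keys)
    linkage_rate = len(links) / len(source) if source else 0.0
    correct = sum(1 for s, t in links if s[id_field] == t[id_field])
    correct_rate = correct / len(links) if links else 0.0
    return linkage_rate, correct_rate


# Hypothetical toy records, invented for illustration only.
vital = [
    {"phn": "1", "surname": "Smith", "first_name": "John", "sex": "M", "dob": "1950-01-02"},
    {"phn": "2", "surname": "Lee", "first_name": "Ann", "sex": "F", "dob": "1980-05-06"},
    {"phn": "3", "surname": "Roy", "first_name": "Marc", "sex": "M", "dob": "1970-03-04"},
    {"phn": "4", "surname": "Chan", "first_name": "Li", "sex": "F", "dob": "1990-07-08"},
]
registry = [
    # Same person as phn 1, but the first name is spelled differently.
    {"phn": "1", "surname": "SMITH", "first_name": "Jon", "sex": "M", "dob": "1950-01-02"},
    {"phn": "2", "surname": "Lee", "first_name": "Ann", "sex": "F", "dob": "1980-05-06"},
    # A different person who shares surname, sex and date of birth with phn 3.
    {"phn": "9", "surname": "Roy", "first_name": "Luc", "sex": "M", "dob": "1970-03-04"},
]

THREE_IDENTIFIERS = ("surname", "sex", "dob")
FOUR_IDENTIFIERS = ("surname", "first_name", "sex", "dob")

for combo in (THREE_IDENTIFIERS, FOUR_IDENTIFIERS):
    rate, correct_rate = linkage_rates(vital, registry, combo)
    print(combo, f"linkage={rate:.2f}", f"correct={correct_rate:.2f}")
```

On this toy data, adding first name removes the one false link but also drops a true link whose first name is spelled differently. The numbers come from the invented records only and say nothing about the study's actual rates.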

Results: Of the three combinations, surname, sex, and date of birth had the highest linkage rate and the second highest correct linkage rate in 2001. Between the population registry and the Vital Statistics registry, the linkage rate was 88.0% and the correct linkage rate was 96.9%; between the hospital discharge data and the Vital Statistics registry, the rates were 93.1% and 98.9%, respectively. Adding first name to these three identifiers increased the correct linkage rate by less than 1% but lowered the linkage rate by almost 10%.

Conclusion: Our findings suggest that the combination of surname, sex and date of birth is optimal for deterministic linkage. The linkage and correct linkage rates appear to vary by age and by type of database, but not by sex.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Birth Certificates
  • Canada / epidemiology
  • Child
  • Child, Preschool
  • Databases, Factual
  • Death Certificates
  • Female
  • Hospital Records / statistics & numerical data
  • Humans
  • Infant
  • Male
  • Medical Record Linkage*
  • Middle Aged
  • Patient Discharge / statistics & numerical data
  • Patient Identification Systems
  • Population Surveillance
  • Public Health Informatics*
  • Registries
  • Vital Statistics*