Integration of Regional Hospitalizations, Registry and Vital Statistics Data for Development of a Single Statewide Ischemic Stroke Database

J Stroke Cerebrovasc Dis. 2022 Mar;31(3):106236. doi: 10.1016/j.jstrokecerebrovasdis.2021.106236. Epub 2021 Dec 23.


Objective: Administrative databases seldom include detailed clinical variables and vital status, limiting the scope of population-based studies. We demonstrate a comprehensive process for integrating 3 databases (all-payor inpatient hospitalizations, clinical acute stroke registry and vital statistics) into a single statewide ischemic stroke database.

Materials and methods: The 3 Massachusetts databases spanned 2007-2017. Our integration process comprised 3 phases: 1) hospitalizations-registry linkage, 2) hospitalizations-vital linkage, and 3) final integration of all 3 databases. Following data uniqueness assessment, rule-based deterministic linkage on indirect identifiers was applied in the first two phases. Because no gold standard database crosswalk was available, we validated the linkages by comparing additional patient variables not used in the linkage process.
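The linkage and validation steps described above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not list the indirect identifiers or the validation variables, so the field names (`hospital_id`, `admit_date`, `age`, `sex`, `discharge_date`) and all function names are hypothetical, and the single exact-match rule stands in for whatever rule set the study actually used.

```python
from collections import Counter

# Hypothetical indirect identifiers; the abstract does not name the actual fields.
KEY_FIELDS = ("hospital_id", "admit_date", "age", "sex")


def link_key(record):
    """Build the linkage key from the indirect identifiers."""
    return tuple(record[f] for f in KEY_FIELDS)


def unique_by_key(records):
    """Uniqueness assessment: keep only records whose key occurs exactly once."""
    counts = Counter(link_key(r) for r in records)
    return {link_key(r): r for r in records if counts[link_key(r)] == 1}


def deterministic_link(hosp_records, registry_records):
    """Link records whose keys match exactly and are unique in both sources."""
    hosp = unique_by_key(hosp_records)
    reg = unique_by_key(registry_records)
    return [(hosp[k], reg[k]) for k in sorted(hosp.keys() & reg.keys())]


def validated_fraction(pairs, holdout_field="discharge_date"):
    """Share of linked pairs that agree on a variable NOT used for linkage."""
    if not pairs:
        return 0.0
    agree = sum(1 for h, r in pairs if h[holdout_field] == r[holdout_field])
    return agree / len(pairs)


# Toy records (fabricated for illustration; not study data).
hosp = [
    {"hospital_id": "H1", "admit_date": "2010-03-01", "age": 70, "sex": "F",
     "discharge_date": "2010-03-05"},
    # Same key as the record above, so both are dropped as non-unique.
    {"hospital_id": "H1", "admit_date": "2010-03-01", "age": 70, "sex": "F",
     "discharge_date": "2010-03-09"},
    {"hospital_id": "H2", "admit_date": "2011-07-15", "age": 64, "sex": "M",
     "discharge_date": "2011-07-20"},
    {"hospital_id": "H3", "admit_date": "2012-01-02", "age": 81, "sex": "F",
     "discharge_date": "2012-01-10"},
]
registry = [
    {"hospital_id": "H1", "admit_date": "2010-03-01", "age": 70, "sex": "F",
     "discharge_date": "2010-03-05"},
    {"hospital_id": "H2", "admit_date": "2011-07-15", "age": 64, "sex": "M",
     "discharge_date": "2011-07-20"},
    # Links on the key but disagrees on the held-out variable.
    {"hospital_id": "H3", "admit_date": "2012-01-02", "age": 81, "sex": "F",
     "discharge_date": "2012-01-11"},
]

pairs = deterministic_link(hosp, registry)
frac_validated = validated_fraction(pairs)
```

In this toy example the ambiguous H1 admissions are left unlinked rather than guessed at, which is the usual trade-off of a deterministic rule: fewer links, but each one rests on an exact, unique match.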

Results: During the overlapping period from 1/1/2008 to 9/30/2015, there were 47,713 stroke admissions in the hospitalizations database and 43,487 admissions in the registry. We linked 38,493 (80.7%) of the hospitalization admissions, 95% of which were validated. There were 391,176 deaths reported in Massachusetts between 1/1/2010 and 3/6/2017 in the vital database. Of the 38,493 encounters in the hospitalizations-registry linked data, 10,660 (27.7%) were linked to deaths, reflecting the cumulative mortality over the 7-year period among all registry-linked ischemic stroke hospitalization records.
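The reported percentages can be reproduced from the counts above. The short check below is only arithmetic on the published figures; the variable names are ours. It also shows that the 80.7% linkage rate uses the hospitalizations count, not the registry count, as its denominator.

```python
# Counts as reported in the Results.
hosp_admissions = 47_713      # stroke admissions, hospitalizations database
registry_admissions = 43_487  # admissions, clinical registry
linked = 38_493               # hospitalizations-registry linked encounters
linked_deaths = 10_660        # linked encounters matched to a death record

link_rate_hosp = linked / hosp_admissions
link_rate_registry = linked / registry_admissions
cumulative_mortality = linked_deaths / linked

print(f"Linked share of hospitalizations: {link_rate_hosp:.1%}")      # 80.7%
print(f"Linked share of registry records: {link_rate_registry:.1%}")  # 88.5%
print(f"Cumulative mortality among linked: {cumulative_mortality:.1%}")  # 27.7%
```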

Conclusion: We demonstrate that a high-quality integration of the statewide hospitalizations, clinical registry, and vital statistics databases is achievable by leveraging indirect identifiers. This data integration framework takes advantage of rich clinical data in registries and long-term outcomes from hospitalizations and vital records and may have value for larger-scale outcomes research.

Keywords: Clinical registry; Database integration; Health services research; Ischemic stroke; Vital statistics.

MeSH terms

  • Databases, Factual*
  • Hospitalization
  • Humans
  • Ischemic Stroke* / epidemiology
  • Ischemic Stroke* / therapy
  • Massachusetts / epidemiology
  • Registries
  • Vital Statistics