Population-level surveillance of congenital heart defects among adolescents and adults in Colorado: Implications of record linkage

Am Heart J. 2020 Aug;226:75-84. doi: 10.1016/j.ahj.2020.04.008. Epub 2020 Apr 19.


Background: The objective was to describe the design of a population-level surveillance system, based on electronic health record (EHR) and insurance claims data, of adolescents and adults with congenital heart defects (CHDs) in Colorado and to evaluate the bias introduced by duplicate cases across data sources.

Methods: The Colorado CHD Surveillance System ascertained individuals aged 11-64 years with a CHD based on International Classification of Diseases, Ninth Revision, Clinical Modification diagnostic coding between 2011 and 2013 from a diverse network of health care systems and an All Payer Claims Database (APCD). A probability-based identity reconciliation algorithm identified duplicate cases. Logistic regression was conducted to investigate bias introduced by duplicate cases on the relationship between CHD severity (severe compared to moderate/mild) and adverse outcomes including all-cause mortality, inpatient hospitalization, and major adverse cardiac events (myocardial infarction, congestive heart failure, or cerebrovascular event). Sensitivity analyses were conducted to investigate bias introduced by the sole use or exclusion of APCD data.
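
As an illustration of what a probability-based identity reconciliation step can look like, the following is a minimal sketch in the style of Fellegi-Sunter match weights. It is not the study's algorithm, which is not specified here. The field names, the m- and u-probabilities, and the threshold are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical (m, u) probabilities per field; illustrative only, not from the study.
# m = P(fields agree | same person), u = P(fields agree | different people).
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.02),
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 likelihood ratios over the fields present in both records."""
    total = 0.0
    for field, (m, u) in params.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log2(m / u)  # agreement raises the weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def is_duplicate(rec_a, rec_b, threshold=10.0):
    """Flag a record pair as the same person when the weight meets the (assumed) threshold."""
    return match_weight(rec_a, rec_b) >= threshold


ehr = {"last_name": "DOE", "first_name": "JANE", "birth_date": "1990-02-03", "sex": "F"}
claim = {"last_name": "DOE", "first_name": "JANE", "birth_date": "1990-02-03", "sex": "F"}
other = {"last_name": "ROE", "first_name": "MARY", "birth_date": "1985-07-11", "sex": "F"}

print(is_duplicate(ehr, claim), is_duplicate(ehr, other))
```

In practice the m- and u-probabilities are usually estimated from the data, and pairs with intermediate weights are often sent for manual review instead of being decided by a single cutoff.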

Results: A total of 12,293 unique cases were identified, of which 3,476 had a within- or between-data-source duplicate. Duplicate cases were more likely to be in the youngest age group and to have private health insurance, a severe heart defect, a CHD comorbidity, and higher health care utilization. We found that failure to resolve duplicate cases between data sources would inflate the relationship between CHD severity and both morbidity and mortality outcomes by ~15%. Sensitivity analyses indicated that scenarios in which APCD was excluded from case finding or relied upon as the sole source of case finding would also result in an overestimation of the relationship between CHD severity and major adverse outcomes.
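
The mechanism behind this inflation can be shown with a toy calculation. The sketch below uses an unadjusted odds ratio from a 2x2 table as a simplified stand-in for the study's logistic regression. The records are entirely hypothetical and are constructed so that the one duplicated person is a severe case with an adverse outcome; they do not reproduce the study's data or its ~15% estimate.

```python
def odds_ratio(records):
    """Unadjusted odds ratio of the outcome for severe vs. moderate/mild cases."""
    a = sum(1 for _, severe, outcome in records if severe and outcome)
    b = sum(1 for _, severe, outcome in records if severe and not outcome)
    c = sum(1 for _, severe, outcome in records if not severe and outcome)
    d = sum(1 for _, severe, outcome in records if not severe and not outcome)
    return (a * d) / (b * c)


def deduplicate(records):
    """Keep one record per person identifier (first occurrence wins)."""
    seen = {}
    for rec in records:
        seen.setdefault(rec[0], rec)
    return list(seen.values())


# Hypothetical records: (person_id, severe_chd, adverse_outcome).
raw = [
    (1, True, True), (2, True, True), (3, True, False), (4, True, False),
    (5, False, True), (6, False, False), (7, False, False), (8, False, False),
    (1, True, True),  # person 1 appears again in a second data source
]

print(odds_ratio(deduplicate(raw)))  # one record per person
print(odds_ratio(raw))               # duplicate left unresolved
```

With these made-up records the odds ratio is 3.0 after deduplication and 4.5 when the duplicate is left in. The direction and size of the distortion depend on which cases are duplicated, which is why the characteristics of duplicate cases matter.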

Discussion: Aggregated EHR- and claims-based surveillance systems of adolescents and adults with CHD that fail to account for duplicate records will introduce considerable bias into research findings.

Conclusion: Population-level surveillance systems for rare chronic conditions, such as congenital heart disease, based on aggregation of EHR and claims data require sophisticated identity reconciliation methods to prevent bias introduced by duplicate cases.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adolescent
  • Adult
  • Bias
  • Child
  • Colorado / epidemiology
  • Electronic Health Records
  • Female
  • Heart Defects, Congenital / epidemiology*
  • Humans
  • Information Storage and Retrieval / statistics & numerical data*
  • Insurance Claim Reporting
  • Male
  • Medical Record Linkage*
  • Middle Aged
  • Population Surveillance / methods*
  • Young Adult