Linking HIV and Viral Hepatitis Surveillance Data: Evaluating a Standard, Deterministic Matching Algorithm Using Data From 6 US Health Jurisdictions

Am J Epidemiol. 2018 Nov 1;187(11):2415-2422. doi: 10.1093/aje/kwy161.

Abstract

Accurate interpretations and comparisons of record linkage results across jurisdictions require valid and reliable matching methods. We compared existing matching methods used by 6 US state and local health departments (Houston, Texas; Louisiana; Michigan; New York, New York; North Dakota; and Wisconsin) to link human immunodeficiency virus and viral hepatitis surveillance data with a 14-key automated, hierarchical deterministic matching method. Applicable years of study varied by disease and jurisdiction, ranging from 1979 to 2016. We calculated percentage agreement and Cohen's κ coefficient to compare the matching methods used within each jurisdiction. We calculated sensitivity, specificity, and positive predictive value for each matching method, as compared with a new standard that included manual review of discrepant cases. Agreement between the existing matching method and the deterministic matching method was 99.6% or higher in all jurisdictions; Cohen's κ values ranged from 0.87 to 0.98. The sensitivity of the deterministic matching method ranged from 97.4% to 100% in the 6 jurisdictions; specificity ranged from 99.7% to 100%; and positive predictive value ranged from 97.4% to 100%. Although no gold standard exists, prior assessments of existing methods and review of discrepant classifications suggest good accuracy and reliability of our deterministic matching method, with the advantage that our method reduces the need for manual review and allows for standard comparisons across jurisdictions when linking human immunodeficiency virus and viral hepatitis data.
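The agreement and accuracy measures named in the abstract can all be derived from a 2x2 cross-classification of record pairs. The following is a minimal sketch of those standard formulas, not the authors' code: the class name `TwoByTwo`, its field names, and the counts in the usage example are hypothetical and are not taken from the study. It assumes all denominators are nonzero.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cross-classification of record pairs by two matching methods.

    Field names are illustrative assumptions, not from the paper.
    """
    both_match: int    # matched by the first and the second method
    only_first: int    # matched by the first method only
    only_second: int   # matched by the second method only
    neither: int       # matched by neither method

    @property
    def total(self) -> int:
        return self.both_match + self.only_first + self.only_second + self.neither


def percent_agreement(t: TwoByTwo) -> float:
    """Share of pairs on which the two methods agree, as a percentage."""
    return 100.0 * (t.both_match + t.neither) / t.total


def cohens_kappa(t: TwoByTwo) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = t.total
    observed = (t.both_match + t.neither) / n
    first_yes = t.both_match + t.only_first
    second_yes = t.both_match + t.only_second
    expected = (first_yes * second_yes + (n - first_yes) * (n - second_yes)) / n**2
    return (observed - expected) / (1.0 - expected)


# For the accuracy measures, the first method is treated as the method under
# evaluation and the second as the reference standard.
def sensitivity(t: TwoByTwo) -> float:
    return t.both_match / (t.both_match + t.only_second)


def specificity(t: TwoByTwo) -> float:
    return t.neither / (t.neither + t.only_first)


def positive_predictive_value(t: TwoByTwo) -> float:
    return t.both_match / (t.both_match + t.only_first)


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the study.
    table = TwoByTwo(both_match=90, only_first=5, only_second=10, neither=9895)
    print(f"agreement   = {percent_agreement(table):.2f}%")
    print(f"kappa       = {cohens_kappa(table):.3f}")
    print(f"sensitivity = {sensitivity(table):.3f}")
    print(f"specificity = {specificity(table):.4f}")
    print(f"PPV         = {positive_predictive_value(table):.3f}")
```

When true matches are rare relative to all record pairs, percentage agreement is dominated by the non-matches, which is one reason a κ value can sit noticeably below a very high percentage agreement.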

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • HIV Infections / epidemiology*
  • Hepatitis B / epidemiology*
  • Hepatitis C / epidemiology*
  • Humans
  • Medical Record Linkage / methods*
  • Medical Record Linkage / standards
  • Public Health Surveillance / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • United States / epidemiology