Identifying dementia cases with routinely collected health data: A systematic review

Alzheimers Dement. 2018 Aug;14(8):1038-1051. doi: 10.1016/j.jalz.2018.02.016. Epub 2018 Apr 3.


Introduction: Prospective, population-based studies can be rich resources for dementia research. Follow-up in many such studies is through linkage to routinely collected, coded health-care data sets. We evaluated the accuracy of these data sets for dementia case identification.

Methods: We systematically reviewed the literature for studies comparing dementia coding in routinely collected data sets to any expert-led reference standard. We recorded study characteristics and two accuracy measures: positive predictive value (PPV) and sensitivity.
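The two accuracy measures can be illustrated with a minimal sketch. The function names and the counts below are hypothetical, chosen only to show the arithmetic; they are not taken from any study in the review. PPV is the proportion of coded dementia cases that the reference standard confirms, and sensitivity is the proportion of reference-standard dementia cases that the coded data set captures.

```python
def ppv(true_positives: int, false_positives: int) -> float:
    """Share of coded dementia cases confirmed by the reference standard."""
    coded_cases = true_positives + false_positives
    if coded_cases == 0:
        raise ValueError("no coded cases; PPV is undefined")
    return true_positives / coded_cases


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of reference-standard dementia cases found in the coded data."""
    reference_cases = true_positives + false_negatives
    if reference_cases == 0:
        raise ValueError("no reference-standard cases; sensitivity is undefined")
    return true_positives / reference_cases


# Hypothetical validation counts (illustrative only):
# 80 coded cases confirmed, 20 coded cases not confirmed,
# 40 reference-standard cases with no dementia code.
tp, fp, fn = 80, 20, 40

print(f"PPV: {ppv(tp, fp):.0%}")
print(f"Sensitivity: {sensitivity(tp, fn):.0%}")
```

In this made-up example a data set could have a high PPV while still missing a third of true cases, which is why the two measures are reported separately.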

Results: We identified 27 eligible studies, with 25 estimating PPV and eight estimating sensitivity. Study settings and methods varied widely. For all-cause dementia, PPVs ranged from 33% to 100%, but 16/27 were >75%. Sensitivities ranged from 21% to 86%. PPVs for Alzheimer's disease (range 57%-100%) were generally higher than those for vascular dementia (range 19%-91%).

Discussion: Linkage to routine health-care data can achieve a high PPV and reasonable sensitivity in certain settings. Given the heterogeneity in accuracy estimates, cohorts should ideally conduct their own setting-specific validation.

Keywords: Alzheimer's disease; Clinical coding; Cohort studies; Dementia; Epidemiology; Positive predictive value; Predictive value of tests; Prospective studies; Sensitivity; Vascular.

Publication types

  • Research Support, Non-U.S. Gov't
  • Systematic Review

MeSH terms

  • Alzheimer Disease / diagnosis*
  • Alzheimer Disease / epidemiology
  • Clinical Coding / standards
  • Data Collection / standards*
  • Delivery of Health Care*
  • Dementia, Vascular / diagnosis
  • Dementia, Vascular / epidemiology
  • Humans
  • Sensitivity and Specificity