Validation sampling can reduce bias in health care database studies: an illustration using influenza vaccination effectiveness

J Clin Epidemiol. 2013 Aug;66(8 Suppl):S110-21. doi: 10.1016/j.jclinepi.2013.01.015.


Objectives: Estimates of treatment effectiveness in epidemiologic studies using large observational health care databases may be biased owing to inaccurate or incomplete information on important confounders. Study methods that collect and incorporate more comprehensive confounder data on a validation cohort may reduce confounding bias.

Study design and setting: We applied two such methods, imputation and reweighting, to Group Health administrative data (the full sample) supplemented by more detailed confounder data from the Adult Changes in Thought study (the validation sample). Using influenza vaccination effectiveness (with an unexposed comparator group) as an example, we evaluated each method's ability to reduce bias in the control time period before influenza circulation, when no true vaccine effect is expected and any estimated association therefore reflects bias.
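
As a rough illustration of the reweighting idea only, the sketch below weights each validation-sample subject by the inverse of the sampling fraction in their stratum, so that the validation sample resembles the full sample on a covariate available in both. The abstract does not specify how the weights were constructed, so this is an assumed, simplified scheme and not the authors' procedure. The age strata, the binary "frailty" confounder, the function names, and all counts are hypothetical.

```python
from collections import Counter


def validation_weights(full_strata, validation_strata):
    """Weight per stratum = (full-sample count) / (validation-sample count)."""
    full_counts = Counter(full_strata)
    val_counts = Counter(validation_strata)
    return {s: full_counts[s] / val_counts[s] for s in val_counts}


def weighted_mean(values, weights):
    """Mean of values, each contributing in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical full sample: only the administrative covariate (age stratum) is known.
full_strata = ["65-74"] * 600 + ["75+"] * 400

# Hypothetical validation sample: (age stratum, frailty indicator).
# The older stratum is over-represented relative to the full sample.
validation = [
    ("65-74", 0), ("65-74", 0), ("65-74", 1), ("65-74", 0),
    ("75+", 1), ("75+", 1), ("75+", 0), ("75+", 1), ("75+", 1), ("75+", 0),
]

w_by_stratum = validation_weights(full_strata, [s for s, _ in validation])
weights = [w_by_stratum[s] for s, _ in validation]
frailty = [f for _, f in validation]

unweighted = sum(frailty) / len(frailty)    # 0.5
reweighted = weighted_mean(frailty, weights)  # about 0.4167

print(round(unweighted, 4), round(reweighted, 4))
```

In this toy example the unweighted validation sample overstates frailty prevalence because older subjects are over-sampled; the weights pull the estimate back toward what the full-sample age mix implies. In practice the weights might instead come from a fitted model of selection into the validation sample.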

Results: Both methods reduced, but did not completely eliminate, the bias compared with traditional effectiveness estimates that do not use the validation sample confounders.

Conclusion: These results support the use of validation sampling methods to improve the accuracy of comparative effectiveness findings from health care database studies. They also illustrate that the success of such methods depends on many factors, including the ability to measure important confounders in a representative and sufficiently large validation sample, the comparability of the full sample and validation sample, and the accuracy with which the data can be imputed or reweighted using the additional validation sample information.

Keywords: Aged; Bias (epidemiologic); Comparative effectiveness research; Confounding factors (epidemiology); Influenza vaccines; Propensity score.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Cohort Studies
  • Comparative Effectiveness Research / statistics & numerical data*
  • Databases, Factual / statistics & numerical data
  • Epidemiologic Factors
  • Female
  • Health Status
  • Humans
  • Influenza, Human / prevention & control*
  • Male
  • Mortality*
  • Outcome Assessment, Health Care / statistics & numerical data*
  • Statistics as Topic*
  • Vaccination / mortality*
  • Washington / epidemiology