When Does Differential Outcome Misclassification Matter for Estimating Prevalence?

Epidemiology. 2023 Mar 1;34(2):192-200. doi: 10.1097/EDE.0000000000001572. Epub 2022 Dec 29.


Background: When accounting for misclassification, investigators make assumptions about whether misclassification is "differential" or "nondifferential." Most guidance on differential misclassification considers settings where outcome misclassification varies across levels of exposure, or vice versa. Here, we examine when covariate-differential misclassification must be considered when estimating overall outcome prevalence.

Methods: We generated datasets with outcome misclassification under five data-generating mechanisms. In each, we estimated prevalence using estimators that (a) ignored misclassification, (b) assumed misclassification was nondifferential, and (c) allowed misclassification to vary across levels of a covariate. We compared bias and precision in estimated prevalence in the study sample and an external target population using different sources of validation data to account for misclassification. We illustrated the use of each approach to estimate HIV prevalence using self-reported HIV status among people in East Africa cross-border areas.
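
The three estimator types in (a) through (c) can be sketched as follows. The abstract does not give the estimators' exact form, so this sketch assumes a Rogan-Gladen-style correction based on sensitivity and specificity taken from validation data. The function names, the dictionary layout, and all numeric inputs are hypothetical and chosen only for illustration.

```python
def naive_prevalence(n_reported_positive, n_total):
    """(a) Ignore misclassification: proportion reporting the outcome."""
    return n_reported_positive / n_total


def corrected_prevalence(p_observed, sensitivity, specificity):
    """(b) Assumed form: Rogan-Gladen-style correction with a single
    sensitivity and specificity (nondifferential misclassification)."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (p_observed + specificity - 1.0) / denom
    # Truncate to the unit interval, since the raw correction can fall outside it.
    return min(1.0, max(0.0, estimate))


def stratified_prevalence(strata):
    """(c) Correct within each covariate level, then average the corrected
    estimates using the covariate distribution of the population of interest."""
    total_weight = sum(s["weight"] for s in strata)
    return sum(
        s["weight"] / total_weight
        * corrected_prevalence(s["p_observed"], s["se"], s["sp"])
        for s in strata
    )


# Hypothetical inputs: two covariate levels of equal size, with lower
# sensitivity of self-report in the second level.
strata = [
    {"weight": 0.5, "p_observed": 0.135, "se": 0.90, "sp": 0.95},
    {"weight": 0.5, "p_observed": 0.215, "se": 0.60, "sp": 0.95},
]

estimate_a = naive_prevalence(175, 1000)
estimate_c = stratified_prevalence(strata)
print(estimate_a, estimate_c)
```

In this sketch the stratum weights come from the population whose prevalence is wanted, so the same stratum-specific corrections could be reweighted for the study sample or for an external target population.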

Results: The estimator that allowed misclassification to vary across levels of the covariate produced results with little bias for both populations in all scenarios but had higher variability when the validation study contained sparse strata. Estimators that assumed nondifferential misclassification produced results with little bias when the covariate distribution in the validation data matched the covariate distribution in the target population; otherwise, estimates assuming nondifferential misclassification were biased.
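
One way to see why the covariate distribution matters is the short derivation below. The notation is an assumption introduced here, not taken from the abstract: Y is the true outcome, Y* the reported outcome, Z the covariate, subscript T denotes the target population, and subscript V denotes the validation data. It also assumes the stratum-specific sensitivity and specificity are the same in both sources.

```latex
% Stratum-specific accuracy (assumed equal in validation data and target):
%   Se_z = P(Y^* = 1 \mid Y = 1, Z = z), \quad Sp_z = P(Y^* = 0 \mid Y = 0, Z = z)
\begin{align*}
  % Marginal accuracy depends on the covariate distribution among cases and non-cases
  Se_T &= \sum_z Se_z \, P_T(Z = z \mid Y = 1), &
  Sp_T &= \sum_z Sp_z \, P_T(Z = z \mid Y = 0), \\
  Se_V &= \sum_z Se_z \, P_V(Z = z \mid Y = 1), &
  Sp_V &= \sum_z Sp_z \, P_V(Z = z \mid Y = 0). \\
  % Observed (misclassified) prevalence in the target, with true prevalence p
  p^* &= Se_T \, p + (1 - Sp_T)(1 - p)
  &\Longrightarrow\quad
  p &= \frac{p^* + Sp_T - 1}{Se_T + Sp_T - 1}. \\
  % Plugging in validation-based values recovers p only if they match the target's
  \hat{p}_{\text{nondiff}} &= \frac{p^* + Sp_V - 1}{Se_V + Sp_V - 1} = p
  &&\text{if } Se_V = Se_T \text{ and } Sp_V = Sp_T. \\
  % Covariate-specific correction, standardized to the target's covariate distribution
  \hat{p}_{\text{strat}} &= \sum_z P_T(Z = z) \,
    \frac{p^*_z + Sp_z - 1}{Se_z + Sp_z - 1}.
\end{align*}
```

Under these assumptions, the marginal values from the validation data equal the target's whenever the distribution of Z among cases and among non-cases is the same in both, as it would be in expectation for a simple random sample of the target. The stratified form does not rely on that match, but each stratum needs enough validation data to estimate its own sensitivity and specificity, which is consistent with the higher variability reported for sparse strata.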

Conclusions: If validation data are a simple random sample from the target population, assuming nondifferential outcome misclassification will yield prevalence estimates with little bias regardless of whether misclassification varies across covariates. Otherwise, obtaining valid prevalence estimates requires incorporating covariates into the estimators used to account for misclassification.
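
The condition in the conclusion can be checked numerically with a small deterministic sketch. It assumes a Rogan-Gladen-style correction, two covariate levels, and stratum-specific prevalence and accuracy that are identical in the validation data and the target; every number and name below is hypothetical.

```python
# Hypothetical stratum-specific truth: prevalence, sensitivity, specificity.
STRATA = {
    "z0": {"prev": 0.10, "se": 0.90, "sp": 0.95},
    "z1": {"prev": 0.30, "se": 0.60, "sp": 0.95},
}


def true_prevalence(z_dist):
    """True prevalence in a population with covariate distribution z_dist."""
    return sum(z_dist[z] * s["prev"] for z, s in STRATA.items())


def observed_prevalence(z_dist):
    """Expected prevalence of the misclassified (reported) outcome."""
    return sum(
        z_dist[z] * (s["se"] * s["prev"] + (1 - s["sp"]) * (1 - s["prev"]))
        for z, s in STRATA.items()
    )


def marginal_accuracy(z_dist):
    """Marginal sensitivity and specificity that a validation sample with
    covariate distribution z_dist would estimate in expectation."""
    cases = sum(z_dist[z] * s["prev"] for z, s in STRATA.items())
    noncases = 1.0 - cases
    se = sum(z_dist[z] * s["prev"] * s["se"] for z, s in STRATA.items()) / cases
    sp = sum(
        z_dist[z] * (1 - s["prev"]) * s["sp"] for z, s in STRATA.items()
    ) / noncases
    return se, sp


def nondifferential_estimate(p_observed, se, sp):
    """Assumed correction form using one sensitivity and one specificity."""
    return (p_observed + sp - 1.0) / (se + sp - 1.0)


target = {"z0": 0.5, "z1": 0.5}
p_obs = observed_prevalence(target)

# Validation data that mirror the target (as a simple random sample would).
matched = nondifferential_estimate(p_obs, *marginal_accuracy(target))

# Validation data drawn mostly from covariate level z0.
skewed = nondifferential_estimate(
    p_obs, *marginal_accuracy({"z0": 0.9, "z1": 0.1})
)

print(true_prevalence(target), matched, skewed)
```

With these inputs the matched validation sample recovers the true target prevalence of 0.20, while the skewed validation sample yields a lower value (about 0.16), because its marginal sensitivity overstates how well self-report performs in the target.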

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • HIV Infections* / epidemiology
  • Humans
  • Prevalence
  • Research Design*
  • Self Report