Adjustment of cancer incidence rates for ethnic misclassification

Biometrics. 1998 Jun;54(2):774-81.


Although ethnic population counts measured by the United States Census are based on self-identification, the same is not necessarily true of cases reported to cancer registries. The use of different ethnic classification methods for numerators and denominators may therefore lead to biased estimates of cancer incidence rates. The extent of such misclassification may be assessed by conducting an ethnicity survey of cancer patients and estimating the proportion misclassified using double sampling models that account for sample stratification. For two ethnic categories, logistic regression may be used to model self-identified ethnicity as a function of demographic variables and the fallible classification method. Incidence rates may then be adjusted for misclassification by using the regression results to estimate the number of cancer cases of a given age, sex, and site in each self-identified ethnic group. The method is illustrated by estimating ethnic misclassification among San Francisco Bay Area Hispanic cancer patients diagnosed in 1990. Results suggest that the number of cancer cases reported as Hispanic underestimates the number of cases self-identified as Hispanic, so that Hispanic cancer rates are also underestimated.
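
The adjustment step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the coefficient values, the covariates (registry classification and sex only), the case counts, and the population figure are all hypothetical, and the function names are invented for this sketch. It assumes a logistic model has already been fitted to survey data, then uses the fitted probabilities to estimate how many registry cases would self-identify as Hispanic.

```python
import math


def expit(x):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical fitted coefficients (NOT taken from the paper).
COEF = {"intercept": -3.0, "registry_hispanic": 5.0, "female": 0.2}


def prob_self_hispanic(registry_hispanic, female, coef=COEF):
    """P(self-identified Hispanic | registry classification, sex)."""
    eta = (coef["intercept"]
           + coef["registry_hispanic"] * registry_hispanic
           + coef["female"] * female)
    return expit(eta)


def adjusted_count(cells, coef=COEF):
    """Expected number of self-identified Hispanic cases.

    cells: iterable of (registry_hispanic, female, n_cases) tuples,
    one per registry classification within a given age/sex/site stratum.
    """
    return sum(n * prob_self_hispanic(r, f, coef) for r, f, n in cells)


def rate_per_100k(cases, population):
    """Incidence rate per 100,000 persons."""
    return 1e5 * cases / population


# Hypothetical stratum: male cases, 100 reported Hispanic, 900 reported non-Hispanic.
cells = [(1, 0, 100), (0, 0, 900)]
census_hispanic_males = 50_000  # hypothetical self-identified denominator

reported = 100
adjusted = adjusted_count(cells)
print("reported rate:", rate_per_100k(reported, census_hispanic_males))
print("adjusted rate:", rate_per_100k(adjusted, census_hispanic_males))
```

With these made-up inputs the adjusted count exceeds the reported count, because a small probability of self-identifying as Hispanic applied to the large non-Hispanic-reported group outweighs the cases lost from the Hispanic-reported group. That is the direction of bias the abstract reports, but the magnitude here carries no meaning.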

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Data Interpretation, Statistical
  • Ethnic Groups / classification*
  • Female
  • Hispanic Americans
  • Humans
  • Incidence
  • Male
  • Models, Statistical
  • Neoplasms / epidemiology*
  • Neoplasms / ethnology
  • Reproducibility of Results
  • San Francisco / epidemiology