Combining information from cancer registry and medical records data to improve analyses of adjuvant cancer therapies

Biometrics. 2009 Sep;65(3):946-52. doi: 10.1111/j.1541-0420.2008.01164.x. Epub 2009 Feb 4.


Cancer registry records contain valuable data on the provision of adjuvant therapies to cancer patients. Previous studies, however, have shown that these therapies are underreported in registry systems, so direct use of registry data may lead to invalid analysis results. We propose first to impute correct treatment status, borrowing information from an additional source such as medical records data collected in a validation sample, and then to analyze the multiply imputed data, as in Yucel and Zaslavsky (2005, Journal of the American Statistical Association 100, 1123-1132). We extend their models to multiple therapies using multivariate probit models with random effects. Our model accounts for the associations among different therapies in both administration and probability of reporting, as well as for the multilevel structure of registry data (patients clustered within hospitals). We use Gibbs sampling to estimate model parameters and impute treatment status. The proposed methodology is applied to data from the Quality of Cancer Care project, in which stage II or III colorectal cancer patients were eligible to receive adjuvant chemotherapy and radiation therapy.
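
The impute-then-analyze idea can be illustrated with a deliberately simplified sketch. It is not the multivariate probit model with random effects described in the abstract. It assumes a single therapy, no covariates, no hospital-level effects, uniform Beta(1, 1) priors, and no false-positive registry reports (a reported treatment is taken as truly given). The function name `gibbs_impute` and all of its parameters are hypothetical and chosen only for illustration. The sampler alternates between drawing the treatment rate and the reporting sensitivity given the current completed data, and re-imputing the true status of unvalidated patients with no registry report.

```python
import random

def gibbs_impute(reported, validated_truth, n_iter=2000, burn_in=500,
                 n_imputations=5, seed=0):
    """Toy data-augmentation Gibbs sampler for underreported treatment.

    reported:        list of 0/1 registry reports, one per patient.
    validated_truth: list of 0/1 true status from medical records, or None
                     for patients outside the validation sample.
    Assumptions (not from the paper): one therapy, no covariates, no
    hospital effects, Beta(1, 1) priors, no false-positive reports.
    """
    rng = random.Random(seed)
    n = len(reported)
    # Start from the registry report wherever the truth is unknown.
    truth = [t if t is not None else r
             for r, t in zip(reported, validated_truth)]
    keep_every = max(1, (n_iter - burn_in) // n_imputations)
    imputations, p_draws = [], []

    for it in range(n_iter):
        n_treated = sum(truth)
        n_rep_treated = sum(r for r, t in zip(reported, truth) if t == 1)

        # Draw treatment rate p and reporting sensitivity s from their
        # Beta full conditionals given the completed data.
        p = rng.betavariate(1 + n_treated, 1 + n - n_treated)
        s = rng.betavariate(1 + n_rep_treated,
                            1 + n_treated - n_rep_treated)

        # P(treated | not reported), by Bayes' rule under no false positives.
        prob_missed = p * (1 - s) / (p * (1 - s) + (1 - p))

        for i in range(n):
            if validated_truth[i] is not None:
                continue  # truth known from medical records
            if reported[i] == 1:
                truth[i] = 1  # assumed: reports are never false positives
            else:
                truth[i] = 1 if rng.random() < prob_missed else 0

        if it >= burn_in:
            p_draws.append(p)
            if ((it - burn_in) % keep_every == 0
                    and len(imputations) < n_imputations):
                imputations.append(list(truth))

    return imputations, p_draws
```

Each returned imputation is a completed vector of treatment statuses that could be analyzed separately, with the results combined across imputations afterwards. Extending the sketch toward the setting in the abstract would require covariates, correlated therapies, and hospital-level random effects, none of which are modeled here.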

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adolescent
  • Adult
  • Antineoplastic Agents / therapeutic use*
  • Bayes Theorem
  • California
  • Chemotherapy, Adjuvant / statistics & numerical data*
  • Colorectal Neoplasms / drug therapy*
  • Colorectal Neoplasms / pathology
  • Data Collection
  • Female
  • Guideline Adherence
  • Hospitals / standards
  • Humans
  • Male
  • Medical Records / standards*
  • Medical Records / statistics & numerical data
  • Middle Aged
  • Neoplasm Staging
  • Quality of Health Care
  • Registries / standards*
  • Registries / statistics & numerical data


Substances

  • Antineoplastic Agents