Validity of cancer diagnosis in a primary care database compared with linked cancer registrations in England. Population-based cohort study

Cancer Epidemiol. 2012 Oct;36(5):425-9. doi: 10.1016/j.canep.2012.05.013. Epub 2012 Jun 21.


Aims: The present study aimed to evaluate the validity of cancer diagnoses and death recording in a primary care database compared with cancer registry (CR) data in England.

Methods: The eligible cohort comprised 42,556 participants registered with English general practices in the General Practice Research Database (GPRD) that consented to CR linkage. CR and primary care records were compared for cancer diagnosis, date of cancer diagnosis and death. Read and ICD cancer code sets were reviewed and agreed by two authors.

Results: There were 5216 (91% of CR total) cancer events diagnosed in both sources. There were 494 (9%) diagnosed in CR only and 213 (4%) diagnosed in GPRD only. The predictive value of a GPRD cancer diagnosis was 96% for lung cancer, 92% for urinary tract cancer, 96% for gastro-oesophageal cancer and 98% for colorectal cancer. Some 'false negative' primary care records were accounted for by registration ending shortly before the cancer diagnosis date. The date of cancer diagnosis was a median of 11 days (interquartile range -6 to 30) later in GPRD than in CR. Death records were consistent for the two sources for 3337/3397 (99%) of cases.
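
The agreement percentages above can be checked from the reported counts. The following is a minimal sketch, not the authors' analysis: the variable names and the `pct` helper are illustrative, and the CR total is assumed to be the sum of events found in both sources and in CR only. The abstract does not state the denominator for the 4% figure, so both plausible denominators are tried.

```python
# Counts taken from the Results paragraph.
both = 5216       # cancer events recorded in both GPRD and CR
cr_only = 494     # cancer events recorded in CR only
gprd_only = 213   # cancer events recorded in GPRD only

# Assumed totals (not stated explicitly in the abstract).
cr_total = both + cr_only        # all events known to the cancer registry
gprd_total = both + gprd_only    # all events known to primary care


def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number (hypothetical helper)."""
    return round(100 * numerator / denominator)


in_both = pct(both, cr_total)              # share of CR events also in GPRD
missed_by_gprd = pct(cr_only, cr_total)    # share of CR events absent from GPRD

# The denominator of the 4% figure is ambiguous; try both candidates.
gprd_only_of_gprd = pct(gprd_only, gprd_total)
gprd_only_of_cr = pct(gprd_only, cr_total)

print(in_both, missed_by_gprd, gprd_only_of_gprd, gprd_only_of_cr)
```

Under these assumptions the first two values reproduce the reported 91% and 9%, and the 4% figure is reproduced with either denominator, so the counts alone do not settle which one was used.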

Conclusion: Recording of cancer diagnosis and mortality in primary care electronic records is generally consistent with CR in England. Linkage studies must pay careful attention to the selection of codes used to define eligibility and to the timing of diagnoses in relation to the beginning and end of the record.

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Cohort Studies
  • Databases, Factual / statistics & numerical data*
  • Death Certificates*
  • Electronic Health Records / statistics & numerical data*
  • England / epidemiology
  • False Negative Reactions
  • False Positive Reactions
  • Female
  • Humans
  • Male
  • Medical Record Linkage
  • Neoplasms / classification
  • Neoplasms / diagnosis*
  • Neoplasms / epidemiology*
  • Predictive Value of Tests
  • Primary Health Care / statistics & numerical data*
  • Registries / statistics & numerical data*
  • Survival Rate