Purpose: This study assessed the validity of claims-based definitions for identifying the incidence of total and site-specific cancers in a population-based cohort study.
Methods: Claims data were obtained for 21 946 participants aged 40-74 years enrolled in the Japan Public Health Center-based Prospective Study for the Next Generation. We defined total and site-specific cancer incidence using combinations of codes from claims data, including diagnosis and procedure codes for cancer therapy. Data from the cancer registry were used as the gold standard to evaluate validity.
Results: Among 21 946 participants, 454 total, 89 stomach, 67 colorectal, 51 lung, 39 breast and 99 prostate invasive cancer cases were newly diagnosed according to the cancer registry. For invasive cancer, the sensitivity and specificity of the definition that combined diagnosis codes with procedure codes for cancer therapy were 87.0% and 99.4% for total, 88.8% and 99.9% for stomach, 80.6% and 99.9% for colorectal, 86.3% and 99.9% for lung, 100% and 99.9% for breast and 91.9% and 99.9% for prostate cancer, respectively. Furthermore, for invasive and/or in situ cancer, the sensitivity and specificity of the same definition were 84.5% and 99.5% for total, 66.7% and 99.9% for colorectal and 100% and 99.9% for breast cancer, respectively.
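For readers less familiar with these measures, the following is a minimal sketch of how sensitivity and specificity can be computed when registry records are treated as the gold standard. It is an illustration only, not the authors' analysis code: the function name `validity`, the set-based representation and the ten-person toy cohort are all hypothetical, and none of the numbers below come from the study.

```python
def validity(claims_pos: set, registry_pos: set, cohort: set) -> dict:
    """Compare a claims-based case definition with registry cases (gold standard).

    All three arguments are sets of participant IDs (hypothetical layout).
    """
    tp = len(claims_pos & registry_pos)            # flagged by claims, confirmed by registry
    fn = len(registry_pos - claims_pos)            # registry cases missed by claims
    fp = len(claims_pos - registry_pos)            # flagged by claims, not in registry
    tn = len(cohort - claims_pos - registry_pos)   # non-cases by both sources
    return {
        "tp": tp, "fn": fn, "fp": fp, "tn": tn,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy data, invented purely for illustration (not study data).
cohort = set(range(1, 11))      # 10 hypothetical participants
registry_pos = {1, 2, 3, 4}     # cases according to the registry
claims_pos = {1, 2, 3, 5}       # cases according to the claims definition

result = validity(claims_pos, registry_pos, cohort)
print(f"sensitivity = {result['sensitivity']:.3f}")
print(f"specificity = {result['specificity']:.3f}")
```

In this framing, a low sensitivity for one site, as reported for colorectal cancer, corresponds to a larger share of registry cases falling into the `fn` count, which is why the claims-based definition would underestimate incidence for that site.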
Conclusions: Our findings suggest that claims-based definitions using diagnosis and procedure codes generally have high validity for total, stomach, lung, breast and prostate cancer incidence, but may underestimate colorectal cancer incidence.
Keywords: cancer incidence; cancer registry; claims data; general population; total cancer; validation.
© 2022 John Wiley & Sons Ltd.