Development and validation of a clinical breast cancer tool for accurate prediction of recurrence

NPJ Breast Cancer. 2024 Jun 15;10(1):46. doi: 10.1038/s41523-024-00651-5.

Abstract

Given the high cost of Oncotype DX (ODX) testing, which is widely used to assess recurrence risk in early-stage breast cancer, previous studies have predicted ODX results using quantitative clinicopathologic variables. However, such models have been built on only small cohorts. Using a cohort of patients from the National Cancer Database (NCDB, n = 53,346), we trained machine learning models to predict low-risk (0-25) or high-risk (26-100) ODX scores under three feature sets: quantitative estrogen receptor (ER)/progesterone receptor (PR)/Ki-67 status, quantitative ER/PR status alone, and no quantitative features. Models were externally validated on a diverse cohort of 970 patients (median follow-up 55 months) for accuracy in predicting ODX risk category and recurrence. In a held-out set from NCDB, models incorporating quantitative ER/PR (area under the receiver operating characteristic curve [AUROC] 0.78, 95% CI 0.77-0.80) and ER/PR/Ki-67 (AUROC 0.81, 95% CI 0.80-0.83) outperformed the non-quantitative model (AUROC 0.70, 95% CI 0.68-0.72). These results were preserved in the validation cohort, where the ER/PR/Ki-67 model (AUROC 0.87, 95% CI 0.81-0.93, p = 0.009) and the ER/PR model (AUROC 0.86, 95% CI 0.80-0.92, p = 0.031) significantly outperformed the non-quantitative model (AUROC 0.80, 95% CI 0.73-0.87). Using a high-sensitivity rule-out threshold, the non-quantitative, quantitative ER/PR and ER/PR/Ki-67 models identified 35%, 30% and 43% of patients, respectively, as low-risk in the validation cohort. Of these low-risk patients, fewer than 3% had a recurrence at 5 years. These models may help identify patients who can forgo genomic testing and initiate endocrine therapy alone. An online calculator is provided for further study.
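
For readers unfamiliar with the two evaluation ideas used above, the following is a minimal sketch of how an AUROC and a high-sensitivity rule-out threshold could be computed from model scores. It is not the authors' implementation: the function names, the target sensitivity, and the toy data are all assumptions made for illustration, and the abstract does not state which sensitivity target or thresholding procedure was used.

```python
import numpy as np

def auroc(y_true, scores):
    """AUROC via the Mann-Whitney U statistic; tied scores get average ranks."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores), dtype=float)
    i = 0
    while i < len(scores):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0  # 1-based average rank
        i = j + 1
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

def rule_out_threshold(y_true, scores, target_sensitivity):
    """Largest cutoff t such that calling `score >= t` high-risk still
    captures at least `target_sensitivity` of the true high-risk cases."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos_scores = np.sort(scores[y_true])
    n_pos = len(pos_scores)
    k = int(np.ceil(target_sensitivity * n_pos))  # positives that must be caught
    return pos_scores[n_pos - k]

def low_risk_fraction(scores, threshold):
    """Share of patients whose score falls below the rule-out cutoff."""
    return float(np.mean(np.asarray(scores, dtype=float) < threshold))

# Toy data (hypothetical): 1 = high-risk ODX (26-100), 0 = low-risk (0-25).
y = [0, 0, 0, 0, 1, 1, 1, 1]
s = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

auc = auroc(y, s)
t = rule_out_threshold(y, s, target_sensitivity=0.75)  # illustrative target only
frac = low_risk_fraction(s, t)
print(auc, t, frac)
```

Raising the target sensitivity lowers the cutoff, so fewer patients are ruled out but fewer high-risk cases are missed; that trade-off sits behind the low-risk percentages quoted above.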