Genomic features of rapid versus late relapse in triple negative breast cancer

BMC Cancer. 2021 May 18;21(1):568. doi: 10.1186/s12885-021-08320-7.


Background: Triple-negative breast cancer (TNBC) is a heterogeneous disease and we have previously shown that rapid relapse of TNBC is associated with distinct sociodemographic features. We hypothesized that rapid versus late relapse in TNBC is also defined by distinct clinical and genomic features of primary tumors.

Methods: Using three publicly available datasets, we identified 453 patients diagnosed with primary TNBC with adequate follow-up to be characterized as 'rapid relapse' (rrTNBC; distant relapse or death ≤2 years after diagnosis), 'late relapse' (lrTNBC; >2 years) or 'no relapse' (nrTNBC; >5 years with no relapse or death). We explored basic clinical and primary tumor multi-omic data, including whole transcriptome (n = 453), and whole genome copy number and mutation data for 171 cancer-related genes (n = 317). Associations of rapid relapse with clinical and genomic features were assessed using Pearson chi-squared tests, t-tests, ANOVA, and Fisher exact tests. We evaluated logistic regression models of clinical features with subtype versus two models that integrated significant genomic features.
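The three outcome groups defined above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `classify_relapse` and its parameters are hypothetical, and treating event-free patients with 5 years or less of follow-up as unclassifiable is an assumption drawn from the phrase "adequate follow-up".

```python
from typing import Optional


def classify_relapse(event: bool, years: float) -> Optional[str]:
    """Assign a relapse group from the abstract's definitions.

    event: True if distant relapse or death was observed (hypothetical field).
    years: time from diagnosis to the event, or to last follow-up if no event.
    """
    if event:
        # Relapse or death within 2 years is 'rapid'; any later event is 'late'.
        return "rrTNBC" if years <= 2 else "lrTNBC"
    if years > 5:
        # More than 5 years of follow-up with no relapse or death.
        return "nrTNBC"
    # Assumption: event-free patients with <= 5 years of follow-up
    # cannot be classified and are excluded.
    return None
```

For example, `classify_relapse(True, 1.5)` returns `"rrTNBC"`, while `classify_relapse(False, 4.0)` returns `None` under the stated exclusion assumption.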

Results: Relative to nrTNBC, both rrTNBC and lrTNBC had significantly lower immune signatures, and immune signatures were highly correlated with the anti-tumor CD8 T-cell, M1 macrophage, and gamma-delta T-cell immune subsets inferred by CIBERSORT. Intriguingly, lrTNBCs were enriched for luminal signatures. There was no difference in tumor mutation burden or percent genome altered across groups. Logistic regression models that incorporated genomic features significantly outperformed standard clinical/subtype models in training (n = 63 patients), testing (n = 63) and independent validation (n = 34) cohorts, although the performance of all models was modest overall.

Conclusions: We identify clinical and genomic features associated with rapid relapse TNBC for further study of this aggressive TNBC subset.

Keywords: Breast Cancer; Machine learning; Triple-negative breast cancer.

MeSH terms

  • Adult
  • Biomarkers, Tumor / genetics*
  • Chemotherapy, Adjuvant / statistics & numerical data
  • DNA Copy Number Variations
  • Datasets as Topic
  • Disease-Free Survival
  • Female
  • Follow-Up Studies
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Logistic Models
  • Mastectomy*
  • Middle Aged
  • Models, Genetic
  • Mutation
  • Neoadjuvant Therapy / statistics & numerical data*
  • Neoplasm Recurrence, Local / epidemiology
  • Neoplasm Recurrence, Local / genetics*
  • Neoplasm Recurrence, Local / prevention & control
  • Prognosis
  • Risk Assessment / methods
  • Risk Assessment / statistics & numerical data
  • Time Factors
  • Triple Negative Breast Neoplasms / genetics
  • Triple Negative Breast Neoplasms / mortality
  • Triple Negative Breast Neoplasms / therapy*

