Unveiling breast cancer risk profiles: a survival clustering analysis empowered by an online web application

Future Oncol. 2023 Dec;19(40):2651-2667. doi: 10.2217/fon-2023-0736. Epub 2023 Dec 14.


Aim: To develop a Shiny app that lets clinicians investigate breast cancer treatments through a new approach combining unsupervised clustering with survival information. Materials & methods: The analysis is based on the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset, which contains 1726 subjects and 22 variables. Cox regression was used to identify the survival risk factors supplied to K-means clustering. Log-rank tests and C-statistics were compared across different numbers of clusters, and Kaplan-Meier plots were presented. Results & conclusion: This study fills an existing gap by offering clinicians a combination of unsupervised learning and survival information, demonstrating the potential of survival clustering as a tool for uncovering hidden structure based on distinct risk profiles.
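The clustering-then-testing step described in the methods can be illustrated with a minimal sketch. It is not the authors' implementation (their app is built with Shiny) and it makes several assumptions: the Cox regression step is taken as already done, so the two feature columns stand in for hypothetical, pre-scaled risk factors; the eight patients, their follow-up times, and their event flags are invented toy values, not METABRIC data; K-means is a plain Lloyd's algorithm initialised with the first k points for reproducibility; and the log-rank test is the two-group version only. The function names `kmeans` and `logrank_chi2` are illustrative.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; returns one cluster label per point.

    Assumption: centers start at the first k points (deterministic toy init).
    """
    centers = list(points[:k])
    labels = None
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    _, g1 = sorted(set(groups))  # expects exactly two group labels
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                       # at risk, all
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == g1)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)  # events at t
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == g1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of the event count in group g1.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


# Hypothetical toy data: two scaled risk factors per patient.
features = [(0.10, 0.20), (0.90, 0.80), (0.20, 0.10), (0.80, 0.90),
            (0.15, 0.25), (0.85, 0.75), (0.05, 0.10), (0.95, 0.90)]
months = [60, 12, 72, 20, 55, 8, 80, 15]   # follow-up time
died = [0, 1, 1, 1, 0, 1, 0, 1]            # 1 = event observed, 0 = censored

labels = kmeans(features, k=2)
chi2 = logrank_chi2(months, died, labels)
print("cluster labels:", labels)
print("log-rank chi-square:", round(chi2, 2))
```

A chi-square value above 3.84 corresponds to p < 0.05 at one degree of freedom, which indicates that the two clusters differ in survival. Comparing several cluster numbers, as the abstract describes, would require the multi-group form of the log-rank test and a C-statistic, both omitted here for brevity.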

Keywords: Cox regression; K-means clustering; Kaplan–Meier curve; breast cancer; cancer risk profiles; machine learning; shiny; survival; unsupervised learning; web-based application.

MeSH terms

  • Breast Neoplasms* / epidemiology
  • Breast Neoplasms* / genetics
  • Breast Neoplasms* / therapy
  • Cluster Analysis
  • Female
  • Humans
  • Risk Factors
  • Survival Analysis