Optimizing Survival Analysis of XGBoost for Ties to Predict Disease Progression of Breast Cancer

IEEE Trans Biomed Eng. 2021 Jan;68(1):148-160. doi: 10.1109/TBME.2020.2993278. Epub 2020 Dec 21.

Abstract

Objective: Several prognostic models for breast cancer based on survival analysis methods have been proposed and extensively validated, and they provide an essential means of guiding clinical diagnosis and treatment to improve patient survival. To analyze clinical and follow-up data of 12119 breast cancer patients from the Clinical Research Center for Breast (CRCB) at West China Hospital of Sichuan University, we developed a gradient boosting algorithm, called EXSA, that optimizes the survival analysis of the XGBoost framework for tied event times in order to predict the disease progression of breast cancer.

Methods: EXSA is based on the XGBoost framework in machine learning and the Cox proportional hazards model in survival analysis. By taking the Efron approximation of the partial likelihood function as the learning objective for ties, EXSA derives gradient formulas for a more precise approximation, which enhances the ability of XGBoost to handle survival data with ties. After retaining 4575 patients (3202 cases for training, 1373 cases for testing), we used the EXSA method to build a prognostic model that estimates disease progression. The model produces a risk score for disease progression, and we also present the risk grouping and the continuous functions relating risk scores to the disease progression rate at 5 and 10 years.
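
To make the Efron objective mentioned in the Methods more concrete, the following is a minimal NumPy sketch of the Efron-approximated negative log partial likelihood and its first-order gradient with respect to the per-patient risk scores. It is an illustration under stated assumptions, not the authors' EXSA implementation: the function name `efron_loss_and_grad` and its arguments are hypothetical, the risk score is assumed to be the log hazard ratio of a Cox model, and the second-order terms that an XGBoost custom objective also requires are omitted.

```python
import numpy as np


def efron_loss_and_grad(scores, times, events):
    """Efron negative log partial likelihood and its gradient (illustrative).

    scores : predicted log-risk for each patient (hypothetical model output)
    times  : observed follow-up time for each patient
    events : 1 if progression was observed, 0 if the patient was censored
    """
    scores = np.asarray(scores, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)

    exp_s = np.exp(scores)
    loss = 0.0
    grad = np.zeros_like(scores)

    # Loop over the distinct times at which at least one event occurred.
    for t in np.unique(times[events]):
        at_risk = times >= t            # risk set at time t
        tied = (times == t) & events    # events tied at time t
        d = int(tied.sum())             # number of tied events

        risk_sum = exp_s[at_risk].sum()
        tied_sum = exp_s[tied].sum()

        # Numerator of the partial likelihood: sum of scores of the events.
        loss -= scores[tied].sum()
        grad[tied] -= 1.0

        # Efron correction: the l-th tied event sees the risk set with a
        # fraction l/d of the tied group's hazard already removed.
        for l in range(d):
            frac = l / d
            denom = risk_sum - frac * tied_sum
            loss += np.log(denom)
            weight = at_risk.astype(float) - frac * tied.astype(float)
            grad += exp_s * weight / denom

    return loss, grad
```

When no two events share a time, every tied group has size one and the expression reduces to the ordinary Cox partial likelihood, so the sketch only differs from the untied case where ties actually occur.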

Results: Experimental results on the test set show that the EXSA method achieves competitive performance, with a concordance index of 0.83454 and 5-year and 10-year AUCs of 0.83851 and 0.78155, respectively.
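
For readers unfamiliar with the concordance index reported in the Results, a minimal sketch of one common definition (Harrell's C-index) is given here. The abstract does not state which variant the authors computed, so this choice is an assumption, and the function name `concordance_index` is hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative).

    A pair of patients is comparable when the one with the shorter
    follow-up time had an observed event. The pair is concordant when
    that patient also has the higher risk score; tied scores count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable
```

Under this definition a value of 0.5 corresponds to random ordering of patients and 1.0 to perfect ordering.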

Conclusion: The proposed EXSA method can be utilized as an effective method for survival analysis.

Significance: The proposed method provides an important means of analyzing follow-up data in research on breast cancer and other diseases.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms* / diagnosis
  • Disease Progression
  • Female
  • Humans
  • Machine Learning
  • Reproducibility of Results
  • Survival Analysis