Gradient tree boosting and network propagation for the identification of pan-cancer survival networks

STAR Protoc. 2022 Apr 23;3(2):101353. doi: 10.1016/j.xpro.2022.101353. eCollection 2022 Jun 17.

Abstract

Cancer survival prediction typically relies on machine learning techniques that are difficult to interpret, such as gradient tree boosting, so additional steps are needed to assess the biological plausibility of the predictions. Here, we describe a protocol that combines pan-cancer survival prediction using XGBoost tree-ensemble learning with subsequent propagation of the learned feature weights on protein interaction networks. The protocol is based on TCGA transcriptome data from 8,024 patients across 25 cancer types but can easily be adapted to cancer patient data from other sources. For complete details on the use and execution of this protocol, please refer to Thedinga and Herwig (2022).
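
The idea of propagating learned feature weights over a protein interaction network can be illustrated with a minimal sketch. It assumes random walk with restart as the propagation scheme, which the abstract does not specify, so the protocol's actual method may differ. The four-gene path network, the gene labels, the single seed weight standing in for an XGBoost feature weight, and the names `propagate` and `restart` are all hypothetical and chosen only for illustration.

```python
def propagate(adjacency, seed_weights, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected graph (illustrative only).

    adjacency    -- dict mapping each node to the set of its neighbours;
                    assumed symmetric, with no isolated nodes
    seed_weights -- dict mapping nodes to non-negative weights, e.g.
                    feature weights taken from a trained model
    restart      -- probability of jumping back to the seed distribution
    """
    nodes = sorted(adjacency)
    total = sum(seed_weights.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one seed weight must be positive")

    # Normalise the seed weights into a probability distribution.
    p0 = {n: seed_weights.get(n, 0.0) / total for n in nodes}
    p = dict(p0)

    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m passes on an equal share of its current score.
            inflow = sum(p[m] / len(adjacency[m]) for m in adjacency[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:  # stop once the scores have stabilised
            break
    return p


# Hypothetical toy network: GENE_A - GENE_B - GENE_C - GENE_D (a path).
network = {
    "GENE_A": {"GENE_B"},
    "GENE_B": {"GENE_A", "GENE_C"},
    "GENE_C": {"GENE_B", "GENE_D"},
    "GENE_D": {"GENE_C"},
}

# Made-up weight: only GENE_A was used by the (imaginary) model.
seed_weights = {"GENE_A": 1.0}

scores = propagate(network, seed_weights)
for gene in sorted(scores, key=scores.get, reverse=True):
    print(f"{gene}\t{scores[gene]:.4f}")
```

In this toy run the score decays with network distance from the seed gene, and genes that received no weight from the model still obtain a nonzero score through their neighbours. With real data, the seed weights would come from the trained model and the adjacency from a protein interaction database.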

Keywords: Bioinformatics; Cancer; Genomics; Health Sciences; RNAseq; Systems biology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Machine Learning*
  • Neoplasms* / genetics
  • Protein Interaction Maps
  • Transcriptome