Clin Cancer Res. 2018 Mar 15;24(6):1248-1259.
doi: 10.1158/1078-0432.CCR-17-0853. Epub 2017 Oct 5.

Deep Learning-Based Multi-Omics Integration Robustly Predicts Survival in Liver Cancer

Free PMC article

Kumardeep Chaudhary et al. Clin Cancer Res.

Identifying robust survival subgroups of hepatocellular carcinoma (HCC) will significantly improve patient care. Efforts to integrate multi-omics data to explicitly predict HCC survival across multiple patient cohorts are currently lacking. To fill this gap, we present a deep learning (DL)-based model of HCC that robustly differentiates survival subpopulations of patients in six cohorts. We built the DL-based, survival-sensitive model on data from 360 HCC patients, using RNA sequencing (RNA-Seq), miRNA sequencing (miRNA-Seq), and methylation data from The Cancer Genome Atlas (TCGA). The model predicts prognosis as well as an alternative model in which genomics and clinical data are both considered. This DL-based model identifies two optimal subgroups of patients with significant survival differences (P = 7.13e-6) and good model fitness [concordance index (C-index) = 0.68]. The more aggressive subtype is associated with frequent TP53 inactivation mutations, higher expression of stemness markers (KRT19 and EPCAM) and the tumor marker BIRC5, and activated Wnt and Akt signaling pathways. We validated this multi-omics model on five external datasets of various omics types: LIRI-JP cohort (n = 230, C-index = 0.75), NCI cohort (n = 221, C-index = 0.67), Chinese cohort (n = 166, C-index = 0.69), E-TABM-36 cohort (n = 40, C-index = 0.77), and Hawaiian cohort (n = 27, C-index = 0.82). This is the first study to employ DL to identify multi-omics features linked to the differential survival of patients with HCC. Given its robustness over multiple cohorts, we expect this workflow to be useful for predicting HCC prognosis. Clin Cancer Res; 24(6); 1248-59. ©2017 AACR.
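
The concordance index reported for each cohort can be illustrated with a small example. The following is a minimal sketch of Harrell's C-index in plain Python, assuming right-censored data and a numeric risk score per patient (for a two-subgroup model, 1 for the higher-risk group and 0 for the lower-risk group). The function name, argument layout, and toy data are illustrative assumptions; this is not the authors' implementation, which may compute the statistic differently.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # time corresponds to an observed event (not censoring).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk died earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [1, 1, 0, 0]  # 1 = higher-risk subgroup, 0 = lower-risk subgroup
print(concordance_index(times, events, risk))  # 0.9
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the cohort values above should be read.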

Conflict of interest statement

The authors declared no conflict of interest.


Figure 1. Overall workflow
(A) Autoencoder architecture used to integrate three omics data types from HCC. (B) Workflow combining deep learning and machine learning techniques to predict HCC survival subgroups. The workflow includes two steps. Step 1: inferring survival subgroups. Step 2: predicting risk labels for new samples. In Step 1, mRNA, DNA methylation, and miRNA features from the TCGA HCC cohort are stacked as input features for an autoencoder, a deep learning method; each of the new, transformed features in the bottleneck layer of the autoencoder is then subjected to a univariate Cox-PH model to select the features associated with survival; K-means clustering is then applied to the samples represented by these features to identify survival-risk groups. In Step 2, mRNA, methylation, and miRNA input features are ranked by ANOVA F-values, the features shared with the dataset to be predicted are selected, and the top features are used to build SVM model(s) that predict the survival-risk labels of new datasets.
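
The feature-selection part of Step 2 can be sketched as follows. This is a minimal, standard-library illustration of ranking features by one-way ANOVA F-value against the inferred subgroup labels and keeping only features that also exist in the new dataset. The function names, the dictionary layout, and the toy gene names are hypothetical, the code assumes at least two label groups, and the SVM training step is omitted; it is not the authors' code.

```python
def anova_f(values, labels):
    """One-way ANOVA F-value of one feature across label groups."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {g: sum(vs) / len(vs) for g, vs in groups.items()}
    # Between-group and within-group sums of squares.
    ss_between = sum(len(vs) * (means[g] - grand_mean) ** 2
                     for g, vs in groups.items())
    ss_within = sum((v - means[g]) ** 2
                    for g, vs in groups.items() for v in vs)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_common_features(train, labels, new_features, top_n):
    """Rank features shared with the new dataset by F-value, keep top_n.

    train        -- dict: feature name -> list of values per training sample
    labels       -- subgroup label per training sample (e.g. "S1", "S2")
    new_features -- set of feature names available in the new dataset
    """
    common = [f for f in train if f in new_features]
    ranked = sorted(common, key=lambda f: anova_f(train[f], labels),
                    reverse=True)
    return ranked[:top_n]


# Toy data (hypothetical): geneC separates the groups well but is absent
# from the new dataset, so it cannot be used for prediction.
labels = ["S1", "S1", "S2", "S2"]
train = {
    "geneA": [1.0, 2.0, 5.0, 6.0],
    "geneB": [1.0, 5.0, 2.0, 6.0],
    "geneC": [0.0, 0.1, 9.0, 9.1],
}
print(top_common_features(train, labels, {"geneA", "geneB"}, top_n=1))
# ['geneA']
```

Restricting the ranking to shared features is what lets a model trained on TCGA multi-omics data be applied to external cohorts that provide only one omics type.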
Figure 2. Significant survival differences for TCGA and external confirmation cohorts
(A) TCGA cohort, (B) LIRI-JP cohort, (C) NCI cohort, (D) Chinese cohort, (E) E-TABM-36 cohort, and (F) Hawaiian cohort.
Figure 3. Differentially expressed genes and their enriched pathways in the two subtypes from TCGA cohort
S1: aggressive (higher-risk survival) subtype; S2: moderate (lower-risk survival) subtype.
Figure 4. Bipartite graph for significantly enriched KEGG pathways and upregulated genes in the two subtypes
Enriched pathway-gene analysis for upregulated genes in the (A) S1 aggressive tumor sub-group and (B) less aggressive S2 sub-group.
