EBioMedicine. 2022 Jan:75:103750.
doi: 10.1016/j.ebiom.2021.103750. Epub 2021 Dec 15.

Integrative analysis from multi-center studies identities a consensus machine learning-derived lncRNA signature for stage II/III colorectal cancer

Zaoqu Liu et al. EBioMedicine. 2022 Jan.

Abstract

Background: Long non-coding RNAs (lncRNAs) have recently emerged as essential biomarkers of cancer progression. However, studies are limited regarding lncRNAs correlated with recurrence and fluorouracil-based adjuvant chemotherapy (ACT) in stage II/III colorectal cancer (CRC).

Methods: A total of 1640 stage II/III CRC patients were enrolled from 15 independent datasets and a clinical in-house cohort. Ten prevalent machine learning algorithms were collected and combined into 76 algorithm combinations. In addition, 109 published transcriptome signatures were retrieved for comparison. A qRT-PCR assay was performed to verify the model.

Findings: We identified 27 lncRNAs that were stably related to recurrence across multi-center cohorts. From these lncRNAs, a consensus machine learning-derived lncRNA signature (CMDLncS), which exhibited the best power for predicting recurrence risk, was determined from the 76 algorithm combinations. A high CMDLncS indicated unfavorable recurrence and mortality rates. CMDLncS not only worked independently of common clinical traits (e.g., AJCC stage) and molecular features (e.g., microsatellite state, KRAS mutation), but also performed markedly better than these variables. qRT-PCR results from 173 patients further verified the in-silico findings and supported the feasibility of the signature in different centers. Comparisons of CMDLncS with 109 published transcriptome signatures further demonstrated its predictive superiority. Additionally, patients with high CMDLncS benefited more from fluorouracil-based ACT and were characterized by stromal activation and epithelial-mesenchymal transition, while patients with low CMDLncS appeared sensitive to bevacizumab and displayed enhanced immune activation.

Interpretation: CMDLncS provides an attractive platform for identifying patients at high risk of recurrence and could optimize precision treatment to improve clinical outcomes in stage II/III CRC.

Funding: This study was supported by the National Natural Science Foundation of China (81972663); the Henan Province Young and Middle-Aged Health Science and Technology Innovation Talent Project (YXKC2020037); and the Henan Provincial Health Commission Joint Youth Project (SB201902014).

Keywords: Chemotherapy; LncRNA; Machine learning; Recurrence; Stage II/III colorectal cancer.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no competing interests.

Figures

Figure 1
The overall flow of this study.
Figure 2
Integrative construction of a consensus signature. a. C-indices of 76 kinds of prediction models in seven validation cohorts. b. The optimal lambda was determined where the partial likelihood deviance reached its minimum value, which yielded the key lncRNAs with nonzero coefficients. c. LASSO coefficient profiles of the candidate lncRNAs for CMDLncS construction. d. Coefficients of the 12 lncRNAs finally obtained by stepwise Cox regression.
Figure 3
Kaplan-Meier survival analysis of CMDLncS. a–i. Kaplan-Meier curves of RFS according to the CMDLncS in GSE39582 (a), TCGA-CRC (b), GSE143985 (c), GSE161158 (d), GSE17536 (e), GSE29621 (f), GSE31595 (g), GSE92921 (h), and meta-cohort (i). j–n. Kaplan-Meier curves of OS according to the CMDLncS in GSE39582 (j), TCGA-CRC (k), GSE17536 (l), GSE29621 (m), and meta-cohort (n).
Figure 4
Robust performance of CMDLncS. a. Time-dependent ROC analysis for predicting RFS at 1, 3, and 5 years. b. C-indices of CMDLncS across all datasets. c–j. The performance of CMDLncS was compared with common clinical and molecular variables in predicting prognosis across all training and validation cohorts. Z-score test: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.
Figure 5
Comparisons of gene expression signatures. a. Univariate Cox regression analysis of CMDLncS and 109 published signatures. b. C-indices of CMDLncS and 109 published signatures in GSE39582, TCGA-CRC, GSE143985, GSE161158, GSE17536, GSE29621, GSE31595, GSE92921, and meta-cohort. Z-score test: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.
Figure 6
Verification of CMDLncS via qRT-PCR. a, b. Kaplan-Meier curves of RFS (a) and OS (b) according to CMDLncS. c, d. Multivariable Cox regression analysis of RFS (c) and OS (d). e. Time-dependent ROC analysis for predicting RFS at 1, 3, and 5 years. f. The performance of CMDLncS was compared with common clinical and molecular variables in predicting prognosis. Z-score test: **P < 0.01, ***P < 0.001, ****P < 0.0001.
Figure 7
Predictive value of fluorouracil-based ACT and bevacizumab benefits. a. Distributions of CMDLncS between responders and nonresponders of fluorouracil-based ACT. b. ROC curves of CMDLncS to predict the benefits of fluorouracil-based ACT. c. Distributions of CMDLncS between responders and nonresponders of bevacizumab. d. ROC curves of CMDLncS to predict the benefits of bevacizumab. e. Distribution of CMDLncS between responders and nonresponders regarding fluorouracil-based ACT and bevacizumab in our in-house cohort, respectively. f. ROC curves regarding fluorouracil-based ACT and bevacizumab in our in-house cohort, respectively. T-test: ns, P > 0.05; *P < 0.05, ***P < 0.001, ****P < 0.0001.
Figure 8
Biological mechanisms underlying CMDLncS. a. Top 20 pathways that were positively and negatively correlated with CMDLncS. b–e. Correlations of CMDLncS with EMT (b), stromal (c), immune (d), and TIS scores (e).
Figure 9
Immune landscape of CMDLncS. a. Heatmap of the infiltration of 28 immune cell types. b. Relationship between CMDLncS and immune cell infiltration. c. Distributions of the infiltration of 28 immune cell types between high- and low-risk groups. T-test: ns, P > 0.05; *P < 0.05, ***P < 0.001, ****P < 0.0001.
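
Several of the figure captions above report C-indices as the measure of prognostic performance. The abstract does not say how these were computed, so the following is only a minimal sketch of Harrell's concordance index for right-censored data, written in Python for illustration. The function name `harrell_c_index` and its arguments are hypothetical, tied follow-up times are simply skipped, and this is not the authors' implementation.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index (illustrative sketch, not the authors' code).

    times       -- follow-up time for each patient
    events      -- 1 if the event (e.g. recurrence) was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only when the shorter time is an observed event.
        # Assumption: pairs with tied times are skipped for simplicity.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event has the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
perfect = harrell_c_index(toy_times, toy_events, [0.9, 0.7, 0.4, 0.1])
mixed = harrell_c_index(toy_times, toy_events, [0.1, 0.9, 0.5, 0.5])
```

A value of 1.0 means every comparable pair is ranked correctly, and 0.5 corresponds to random ranking. For real analyses an established survival-analysis library would be preferable to this quadratic-time loop.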
