Nonnegative matrix factorization-based bioinformatics analysis reveals that TPX2 and SELENBP1 are two predictors of the inner sub-consensuses of lung adenocarcinoma

Cancer Med. 2021 Dec;10(24):9058-9077. doi: 10.1002/cam4.4386. Epub 2021 Nov 3.

Abstract

Background: Lung adenocarcinoma (LUAD) is a heterogeneous disease. However, the inner sub-groups of LUAD have not been fully studied. Markers that predict the sub-groups and prognosis of LUAD are badly needed.

Aims: To identify biomarkers associated with the sub-groups and prognosis of LUAD.

Materials and methods: Using nonnegative matrix factorization (NMF) clustering, LUAD patients from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) datasets, and LUAD cell lines from the Genomics of Drug Sensitivity in Cancer (GDSC) dataset, were divided into sub-consensuses based on gene expression profiles. The overall survival of LUAD patients in each sub-consensus was assessed by Kaplan-Meier survival analysis. Genes that were differentially expressed in each sub-consensus of both LUAD patients and LUAD cell lines were identified using TBtools. The accuracy of TPX2 and SELENBP1 in predicting the inner sub-consensuses of LUAD was determined by receiver operating characteristic (ROC) analysis. Kaplan-Meier survival analysis was also used to test the prognostic significance of TPX2 and SELENBP1 in LUAD patients.
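
As an illustration of the clustering idea only, the following is a minimal sketch of NMF with multiplicative updates on a toy expression matrix (genes in rows, samples in columns). It is not the authors' pipeline, which the abstract does not specify; the toy values, the function names `nmf` and `assign_clusters`, and the choice of a single run (rather than a consensus over many runs) are all assumptions made for this example.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (genes x samples) as W (genes x k) times H (k x samples)
    using multiplicative updates for the squared-error objective (hypothetical helper)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return W, H

def assign_clusters(H):
    """Assign each sample (column of H) to the factor with the largest coefficient."""
    return [max(range(len(H)), key=lambda r: H[r][j]) for j in range(len(H[0]))]

# Toy data (made up): 4 genes x 6 samples, with two obvious expression blocks.
V = [
    [5.0, 6.0, 5.0, 0.1, 0.2, 0.1],
    [6.0, 5.0, 6.0, 0.2, 0.1, 0.1],
    [0.1, 0.2, 0.1, 5.0, 6.0, 5.0],
    [0.2, 0.1, 0.2, 6.0, 5.0, 6.0],
]
W, H = nmf(V, k=2)
labels = assign_clusters(H)
print(labels)  # one cluster index per sample; which block gets index 0 depends on the seed
```

In practice, the number of factors is usually chosen by comparing cluster stability across many random restarts; this sketch fixes k and uses one run for brevity.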

Results: Using NMF clustering, LUAD patients in the TCGA, GSE30219, GSE42127, GSE50081, GSE68465, and GSE72094 datasets were divided into three sub-consensuses. Sub-consensus3 LUAD patients had low overall survival and a high frequency of TP53 mutations. Similarly, LUAD cell lines were divided into three sub-consensuses by the NMF method, and sub-consensus2 cell lines were resistant to EGFR inhibitors. Analysis of the common genes differentially expressed across sub-consensuses of LUAD patients and LUAD cell lines revealed that TPX2 was highly expressed in sub-consensus3 LUAD patients and sub-consensus2 LUAD cell lines. In contrast, SELENBP1 was highly expressed in sub-consensus1 LUAD patients and sub-consensus1 LUAD cell lines. The expression levels of TPX2 and SELENBP1 could distinguish sub-consensus3 LUAD patients or sub-consensus2 LUAD cell lines from the other sub-consensuses of LUAD patients or cell lines. Moreover, compared with normal lung tissues, TPX2 expression was high and SELENBP1 expression was low in LUAD tissues. Furthermore, higher expression of TPX2 was associated with lower relapse-free survival and lower overall survival of LUAD patients, whereas higher expression of SELENBP1 was associated with higher relapse-free survival and higher overall survival. Finally, we showed that TP53-mutant LUAD patients had higher TPX2 and lower SELENBP1 expression.
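
To make concrete what it means for an expression level to distinguish one sub-consensus from the others, here is a minimal sketch of the area under the ROC curve, computed as the fraction of (positive, negative) sample pairs that the score ranks correctly. The expression values and labels are invented for illustration, and `roc_auc` is a hypothetical helper, not a function from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.
    labels: 1 for the sub-consensus of interest, 0 for all others.
    Ties between a positive and a negative score count as half a correct ranking."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Invented example: a marker that is higher in the target sub-consensus (label 1).
marker_high = [8.1, 7.9, 8.5, 5.0, 5.2, 4.8, 6.0]
in_target = [1, 1, 1, 0, 0, 0, 0]
auc_high = roc_auc(marker_high, in_target)

# A marker that is lower in the target group can be scored by negating its values.
marker_low = [2.0, 2.3, 1.9, 6.5, 6.1, 7.0, 5.8]
auc_low = roc_auc([-x for x in marker_low], in_target)
print(auc_high, auc_low)
```

An AUC near 1 indicates that the marker separates the target sub-consensus almost perfectly, while 0.5 corresponds to chance-level ranking.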

Discussion: Both the iCluster and NMF methods have been shown to be robust LUAD classification systems. However, LUAD patients in different iClusters showed no significant difference in overall survival, whereas sub-consensus3 LUAD patients from the NMF classification had lower overall survival than patients in the other sub-consensuses.
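
The overall survival comparisons above rest on Kaplan-Meier estimates. As a minimal sketch of that estimator (with invented follow-up times, and `kaplan_meier` as a hypothetical helper rather than the software used in the study), the survival probability is multiplied at each event time by the fraction of at-risk patients who survive it; censored patients leave the risk set without changing the estimate.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.
    times: follow-up time per patient; events: 1 if death observed, 0 if censored.
    Returns a list of (time, survival probability) at each time with at least one death."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

# Invented example: four patients, the second one censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
for t, s in curve:
    print(t, s)
```

Comparing curves between groups would additionally require a test such as the log-rank test, which this sketch does not implement.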

Conclusions: Through integrated analysis of 1765 LUAD patients and 64 LUAD cell lines, we showed that NMF is a robust method for classifying the inner sub-consensuses of LUAD. TPX2 and SELENBP1 were differentially expressed in different LUAD sub-consensuses and predicted the inner sub-consensuses of LUAD with high accuracy. TPX2 was an unfavorable prognostic biomarker of LUAD that was up-regulated in LUAD tissues and associated with low overall survival. SELENBP1 was a favorable prognostic biomarker of LUAD that was down-regulated in LUAD tissues and associated with prolonged overall survival.

Keywords: SELENBP1; TPX2; nonnegative matrix factorization; sub-consensus of lung adenocarcinoma.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenocarcinoma of Lung / genetics*
  • Adenocarcinoma of Lung / mortality
  • Adenocarcinoma of Lung / pathology
  • Cell Cycle Proteins / metabolism*
  • Computational Biology / methods*
  • Female
  • Humans
  • Lung Neoplasms / genetics*
  • Lung Neoplasms / mortality
  • Lung Neoplasms / pathology
  • Male
  • Microtubule-Associated Proteins / metabolism*
  • Mutation
  • Prognosis
  • Selenium-Binding Proteins / metabolism*
  • Survival Analysis

Substances

  • Cell Cycle Proteins
  • Microtubule-Associated Proteins
  • SELENBP1 protein, human
  • Selenium-Binding Proteins
  • TPX2 protein, human