Proteome-transcriptome alignment of molecular portraits achieved by self-contained gene set analysis: Consensus colon cancer subtypes case study

PLoS One. 2019 Aug 22;14(8):e0221444. doi: 10.1371/journal.pone.0221444. eCollection 2019.


Gene set analysis (GSA) has become a common methodology for analyzing transcriptomics data. However, self-contained GSA techniques are rarely, if ever, used for proteomics data analysis. Here we present a self-contained, proteome-level GSA of the four consensus molecular subtypes (CMSs) previously established by transcriptome dissection of colon carcinoma specimens. Despite notable differences in the structure of proteomics and transcriptomics data, many pathway-wide characteristic features of CMSs found at the mRNA level were reproduced at the protein level. In particular, CMS1 features show heavy involvement of the immune system as well as pathways related to mismatch repair, DNA replication, and proteasome function, while CMS4 tumors upregulate the complement pathway and proteins participating in epithelial-to-mesenchymal transition (EMT). In addition, protein-level GSA yielded a set of novel observations visible at the proteome level but not at the transcriptome level, including possible involvement of major histocompatibility complex II (MHC-II) antigens in the known immunogenicity of CMS1 and a connection between cholesterol trafficking and the regulation of integrin-linked kinase (ILK) in CMS3. Overall, this study demonstrates the utility of self-contained GSA approaches as a critical tool for analyzing proteomics data in general and for dissecting protein-level molecular portraits of human tumors in particular.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Colonic Neoplasms / genetics*
  • Colorectal Neoplasms / classification
  • Colorectal Neoplasms / genetics
  • Consensus*
  • Extracellular Matrix / metabolism
  • Gene Expression Regulation, Neoplastic
  • Gene Regulatory Networks
  • Genes, Neoplasm*
  • Humans
  • Principal Component Analysis
  • Proteome / metabolism*
  • Signal Transduction / genetics
  • Transcriptome / genetics*


Substances

  • Proteome