ColoType: a forty gene signature for consensus molecular subtyping of colorectal cancer tumors using whole-genome assay or targeted RNA-sequencing

Sci Rep. 2020 Jul 21;10(1):12123. doi: 10.1038/s41598-020-69083-y.


Colorectal cancer (CRC) tumors can be partitioned into four biologically distinct consensus molecular subtypes (CMS1-4) using gene expression. Evidence is accumulating that tumors in different subtypes are likely to respond differently to treatments. However, to date, there is no clinical diagnostic test for CMS subtyping. In this study, we used novel methodology in a multi-cohort training domain (n = 1,214) to develop the ColoType scores and classifier, which predict CMS1-4 based on the expression of 40 genes. In three validation cohorts (n = 1,744 in total) representing three distinct gene-expression measurement technologies, ColoType predicted gold-standard CMS subtypes with accuracies of 0.90, 0.91, and 0.88, respectively. To accommodate potential intratumoral heterogeneity and tumors of mixed subtypes, ColoType was designed to report continuous scores measuring the prevalence of each of CMS1-4 in a tumor, in addition to specifying the most prevalent subtype. For analysis of clinical specimens, ColoType was also implemented with targeted RNA-sequencing (Illumina AmpliSeq). In a series of formalin-fixed, paraffin-embedded CRC samples (n = 49), ColoType by targeted RNA-sequencing agreed with subtypes predicted by two independent methods with accuracies of 0.92 and 0.82, respectively. With further validation, ColoType by targeted RNA-sequencing may enable clinical application of CMS subtyping with widely available and cost-effective technology.
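
The reporting scheme described above (one continuous score per subtype, a call for the most prevalent subtype, and accuracy against gold-standard labels) can be illustrated with a minimal Python sketch. This is not the ColoType algorithm: the abstract does not specify how the scores are computed from the 40 genes, so the scores here are simply taken as given. The function names `dominant_subtype` and `accuracy`, and all numeric values, are hypothetical toy choices made for illustration only.

```python
from typing import Dict, List

SUBTYPES = ("CMS1", "CMS2", "CMS3", "CMS4")


def dominant_subtype(scores: Dict[str, float]) -> str:
    """Return the subtype with the highest continuous score.

    `scores` maps each of CMS1-4 to a prevalence-like score. How such
    scores are derived from gene expression is NOT shown here.
    """
    return max(SUBTYPES, key=lambda subtype: scores[subtype])


def accuracy(predicted: List[str], gold: List[str]) -> float:
    """Fraction of samples whose predicted subtype matches the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one sample is required")
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)


# Toy, made-up scores for four tumors (not data from the study).
toy_scores = [
    {"CMS1": 0.70, "CMS2": 0.10, "CMS3": 0.15, "CMS4": 0.05},
    # A mixed tumor: CMS2 and CMS4 scores are close to each other.
    {"CMS1": 0.05, "CMS2": 0.45, "CMS3": 0.10, "CMS4": 0.40},
    {"CMS1": 0.10, "CMS2": 0.20, "CMS3": 0.60, "CMS4": 0.10},
    {"CMS1": 0.05, "CMS2": 0.30, "CMS3": 0.05, "CMS4": 0.60},
]
toy_gold = ["CMS1", "CMS4", "CMS3", "CMS4"]  # made-up reference labels

toy_calls = [dominant_subtype(s) for s in toy_scores]
toy_accuracy = accuracy(toy_calls, toy_gold)
```

In this toy example the second tumor is called CMS2 although its CMS4 score is nearly as high, which is the kind of mixed case the continuous scores are meant to expose and that a single label would hide.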

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Algorithms
  • Colorectal Neoplasms / classification*
  • Colorectal Neoplasms / genetics
  • Consensus
  • Female
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Neoplastic
  • Gene Regulatory Networks*
  • Humans
  • Male
  • Sequence Analysis, RNA
  • Whole Exome Sequencing