Background: Colon cancer has been classically described by clinicopathologic features that permit the prediction of outcome only after surgical resection and staging.
Methods: We performed an unsupervised analysis of microarray data from 326 colon cancers to identify the first principal component (PC1) of the most variable set of genes. PC1 deciphered two primary, intrinsic molecular subtypes of colon cancer that predicted disease progression and recurrence.
Results: Here we report that the most dominant pattern of intrinsic gene expression in colon cancer (PC1) was tightly correlated (Pearson R = 0.92, P < 10^-135) with the epithelial-mesenchymal transition (EMT) signature, both in gene identity and directionality. In a global microRNA screen, we further identified miR-200, a known regulator of EMT, as the microRNA most strongly anti-correlated with PC1.
Conclusions: These data demonstrate that the biology underpinning the native, molecular classification of human colon cancer, previously thought to be highly heterogeneous, was clarified through the lens of comprehensive transcriptome analysis.
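
The unsupervised step described in Methods (take the most variable genes, extract their first principal component, then correlate it with a signature score) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the function names `first_principal_component` and `pearson_r` are hypothetical, and the gene count and cutoff are arbitrary assumptions chosen only to make the example run.

```python
import numpy as np

def first_principal_component(expr, n_top_genes=100):
    """Return (gene indices, gene loadings, per-sample PC1 scores).

    expr is a genes x samples matrix. The n_top_genes cutoff is an
    illustrative assumption, not a value taken from the study.
    """
    # Keep only the most variable genes across samples.
    variances = expr.var(axis=1)
    top = np.argsort(variances)[::-1][:n_top_genes]
    sub = expr[top]
    # Center each gene, then take the leading singular vectors.
    centered = sub - sub.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # u[:, 0] holds gene loadings; s[0] * vt[0] holds sample scores.
    return top, u[:, 0], s[0] * vt[0]

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic stand-in for expression data: 500 genes, 60 samples.
rng = np.random.default_rng(0)
n_genes, n_samples = 500, 60
latent = rng.normal(size=n_samples)           # hidden two-sided program
expr = rng.normal(scale=0.3, size=(n_genes, n_samples))
expr[:25] += 2.0 * latent                     # genes that rise with the program
expr[25:50] -= 2.0 * latent                   # genes that fall with the program

top, loadings, scores = first_principal_component(expr, n_top_genes=50)

# A hypothetical signature score: mean of "up" genes minus mean of "down" genes.
signature_score = expr[:25].mean(axis=0) - expr[25:50].mean(axis=0)
r = pearson_r(scores, signature_score)
print(f"|R| between PC1 and signature score: {abs(r):.3f}")
```

The sign of a principal component is arbitrary, so only the magnitude of R is meaningful here; in practice the direction would be fixed by reference to genes with a known orientation.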