Background: Sessile serrated adenomas/polyps are distinguished from hyperplastic colonic polyps subjectively by their endoscopic appearance and histological morphology. However, hyperplastic and sessile serrated polyps can have overlapping morphological features resulting in sessile serrated polyps diagnosed as hyperplastic. While sessile serrated polyps can progress into colon cancer, hyperplastic polyps have virtually no risk for colon cancer. Objective measures, differentiating these types of polyps would improve cancer prevention and treatment outcome.
Methods: RNA-seq training data set and Affimetrix, Illumina testing data sets were obtained from Gene Expression Omnibus (GEO). RNA-seq single-end reads were filtered with FastX toolkit. Read mapping to the human genome, gene abundance estimation, and differential expression analysis were performed with Tophat-Cufflinks pipeline. Background correction, normalization, and probe summarization steps for Affimetrix arrays were performed using the robust multi-array method (RMA). For Illumina arrays, log2-scale expression data was obtained from GEO. Pathway analysis was implemented using Bioconductor package GSAR. To build a platform-independent molecular classifier that accurately differentiates sessile serrated and hyperplastic polyps we developed a new feature selection step. We also developed a simple procedure to classify new samples as either sessile serrated or hyperplastic with a class probability assigned to the decision, estimated using Cantelli's inequality.
Results: The classifier trained on RNA-seq data and tested on two independent microarray data sets resulted in zero and three errors. The classifier was further tested using quantitative real-time PCR expression levels of 45 blinded independent formalin-fixed paraffin-embedded specimens and was highly accurate. Pathway analyses have shown that sessile serrated polyps are distinguished from hyperplastic polyps and normal controls by: up-regulation of pathways implicated in proliferation, inflammation, cell-cell adhesion and down-regulation of serine threonine kinase signaling pathway; differential co-expression of pathways regulating cell division, protein trafficking and kinase activities.
Conclusions: Most of the differentially expressed pathways are known as hallmarks of cancer and likely to explain why sessile serrated polyps are more prone to neoplastic transformation than hyperplastic. The new molecular classifier includes 13 genes and may facilitate objective differentiation between two polyps.
Keywords: Cantelli’s inequality; Feature selection; Formalin-fixed paraffin-embedded; Hyperplastic polyps; Microarrays; Molecular signature; RNA-seq; Sessile serrated adenoma/polys; Shrunken centroid classifier; Summary metric.