Identification of commonly dysregulated genes in colorectal cancer by integrating analysis of RNA-Seq data and qRT-PCR validation

Cancer Gene Ther. 2015 May;22(5):278-84. doi: 10.1038/cgt.2015.20. Epub 2015 Apr 24.

Abstract

The progression of colorectal cancer (CRC) is a multistep process, and metastatic CRC is rarely curable; consequently, CRC remains a leading cause of cancer-related death. There is therefore an urgent need to identify biomarkers with sufficient sensitivity and specificity to detect the disease at an early stage, which would significantly reduce its mortality. In this study, we performed an integrated analysis of multiple RNA-Seq data sets to find new candidate biomarkers for diagnosis and prognosis and candidate therapeutic targets, as well as to elucidate the molecular mechanisms of CRC carcinogenesis. By combining five RNA-Seq data sets, we identified 883 differentially expressed genes (DEGs) between CRC and normal control (NC) tissues across the studies. Gene function analysis showed that these DEGs are strongly associated with carcinogenesis. The 10 most significant DEGs were further evaluated by quantitative real-time polymerase chain reaction (qRT-PCR) in both rectal cancer (RC) and colon cancer (CC), and the results agreed well with the integrated data, suggesting that integrated analysis of different RNA-Seq data sets is a valid approach. Integrated analysis of different RNA-Seq data sets may therefore be a useful way to overcome the limitation of small sample size in a single RNA-Seq study. In addition, our study showed that some genes, such as SIM2, ADAMTS6, FOXD4L4 and DNAH5, may have an important role in the development of CRC and could be exploited for the diagnosis, prognosis and therapy of this malignancy. Our findings may also help in understanding the pathology of CRC.
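
The abstract does not say how evidence from the five RNA-Seq data sets was pooled. As an illustration only, the sketch below shows one common way to combine per-study results: Fisher's method for merging independent p-values for the same gene. The choice of Fisher's method, the function names `fisher_combined_p` and `select_degs`, the threshold `alpha`, and the gene labels and p-values in the usage note are all assumptions made for this example; they are not taken from the study.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values for one gene using Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis, where k is the
    number of studies. For even degrees of freedom the upper-tail
    probability has a closed form, so no external library is needed.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2

    # P(chi2_{2k} > X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def select_degs(per_gene_p_values, alpha=0.05):
    """Return gene labels whose combined p-value is below alpha, sorted by name.

    per_gene_p_values maps a gene label to its list of per-study p-values.
    The alpha cutoff is a hypothetical choice; no multiple-testing
    correction is applied in this sketch.
    """
    return sorted(
        gene
        for gene, p_values in per_gene_p_values.items()
        if fisher_combined_p(p_values) < alpha
    )
```

For example, with made-up inputs, `select_degs({"GENE_A": [0.05, 0.05], "GENE_B": [0.5, 0.9]})` keeps only `GENE_A`. Two borderline results for the same gene reinforce each other, which is the sense in which pooling studies can offset small sample sizes in any single one. A real analysis would also need to check the direction of change in each study and correct for multiple testing.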

MeSH terms

  • Adult
  • Aged
  • Colorectal Neoplasms / genetics*
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Male
  • RNA / genetics*
  • Real-Time Polymerase Chain Reaction / methods
  • Real-Time Polymerase Chain Reaction / standards
  • Reproducibility of Results

Substances

  • RNA