Walking pathways with positive feedback loops reveal DNA methylation biomarkers of colorectal cancer

BMC Bioinformatics. 2019 Apr 18;20(Suppl 4):119. doi: 10.1186/s12859-019-2687-7.

Abstract

Background: The search for molecular biomarkers of early-onset colorectal cancer (CRC) is an important but still unsolved challenge. Detection of CpG methylation in human DNA obtained from blood or stool has been proposed as a promising approach to noninvasive early diagnosis of CRC. CRC genomes carry thousands of abnormally methylated CpG positions, which are often located in non-coding parts of genes. Novel bioinformatic methods are thus urgently needed for multi-omics data analysis to reveal causative biomarkers with a potential driver role in the early stages of cancer.

Methods: We have developed a method for finding potential causal relationships between gene expression changes and epigenetic changes (DNA methylation) in gene regulatory regions that affect transcription factor binding sites (TFBS). The method also considers the topology of the signal transduction pathways involved and searches for positive feedback loops that may cause the carcinogenic aberrations in gene expression. We call this method "Walking pathways", since it searches for potential rewiring mechanisms in cancer pathways due to dynamic changes in the DNA methylation status of important gene regulatory regions ("epigenomic walking").
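
To illustrate what a search for positive feedback loops in a pathway topology can look like, the sketch below enumerates simple cycles in a small signed directed graph and keeps those whose product of edge signs is positive. It is a generic illustration and not the authors' implementation: the abstract does not specify the algorithm, and the function name, the edge encoding (+1 for activation, -1 for inhibition), and the toy network with its node names are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Assumed encoding: (source, target) -> sign,
# +1 for activation, -1 for inhibition.
Edges = Dict[Tuple[str, str], int]


def positive_feedback_loops(edges: Edges) -> List[List[str]]:
    """Return simple cycles whose product of edge signs is positive.

    Each cycle is reported once, starting from its alphabetically
    smallest node. Intended for small toy graphs only.
    """
    # Build an adjacency list that also contains pure target nodes.
    graph: Dict[str, List[str]] = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    order = {node: i for i, node in enumerate(sorted(graph))}
    loops: List[List[str]] = []

    def walk(start: str, node: str, path: List[str], sign: int) -> None:
        for nxt in graph[node]:
            s = sign * edges[(node, nxt)]
            if nxt == start:
                # Cycle closed: keep it only if the net sign is positive.
                if s > 0:
                    loops.append(path[:])
            elif nxt not in path and order[nxt] > order[start]:
                # Only visit nodes ranked after the start node, so that
                # every cycle is found from a single starting point.
                path.append(nxt)
                walk(start, nxt, path, s)
                path.pop()

    for start in sorted(graph):
        walk(start, start, [start], 1)
    return loops


# Hypothetical toy network, not taken from the paper.
toy: Edges = {
    ("TF", "Ligand"): +1,        # TF induces expression of a ligand
    ("Ligand", "Receptor"): +1,  # ligand activates its receptor
    ("Receptor", "TF"): +1,      # receptor signalling activates the TF
    ("TF", "Inhibitor"): +1,     # TF also induces an inhibitor
    ("Inhibitor", "TF"): -1,     # inhibitor represses the TF
}

loops = positive_feedback_loops(toy)
print(loops)
```

In this toy network only the loop through the ligand and the receptor is reported; the loop through the inhibitor has a negative net sign and is discarded. Note that a cycle with an even number of inhibitory edges also counts as positive under this definition.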

Results: In this paper, we analysed an extensive collection of genome-wide gene expression data (RNA-seq) and DNA methylation data of genomic CpG islands (obtained with Illumina methylation arrays) from tumor and normal gut epithelial tissue samples of 300 patients with colorectal cancer at different stages of the disease. The data were generated in the EU-supported SysCol project. Potential epigenetic biomarkers of DNA methylation were identified using the fully automatic multi-omics analysis web service "My Genome Enhancer" (MGE) (my-genome-enhancer.com). MGE uses the TRANSFAC® database on gene regulation, the TRANSPATH® database on signal transduction pathways, and software that employs artificial intelligence (AI) methods for the analysis of cancer-specific enhancers.

Conclusions: The identified biomarkers were tested experimentally on an independent set of blood samples from patients with colorectal cancer. Using statistical and machine learning methods, a minimal set of 6 biomarkers was then selected that together achieve the best cancer detection potential. The markers are hypermethylated positions in regulatory regions of the following genes: CALCA, ENO1, MYC, PDX1, TCF7, ZNF43.

Keywords: Circulating DNA; Colorectal cancer; DNA methylation; Genetic algorithm; Multi-omics analysis; Prognostic biomarkers; Signal transduction; Transcription factor binding sites.

MeSH terms

  • Binding Sites / genetics
  • Biomarkers, Tumor / genetics*
  • Colorectal Neoplasms / diagnosis
  • Colorectal Neoplasms / genetics*
  • Colorectal Neoplasms / pathology
  • CpG Islands / genetics
  • DNA Methylation / genetics*
  • Epigenesis, Genetic
  • Feedback, Physiological*
  • Female
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Male
  • Middle Aged
  • Neoplasm Staging
  • Signal Transduction / genetics*
  • Transcription Factors / metabolism

Substances

  • Biomarkers, Tumor
  • Transcription Factors