Constructing disease-specific gene networks using pair-wise relevance metric: application to colon cancer identifies interleukin 8, desmin and enolase 1 as the central elements

BMC Syst Biol. 2008 Aug 10;2:72. doi: 10.1186/1752-0509-2-72.


Background: With the advance of large-scale omics technologies, it is now feasible to reverse-engineer the underlying genetic networks that describe the complex interplay of molecular elements leading to complex diseases. Current networking approaches mainly focus on building genetic networks at large, without probing the interaction mechanisms specific to a physiological or disease condition. The aim of this study was thus to develop a novel networking approach based on the relevance concept, which is well suited to revealing the integrative effects of multiple genes in the underlying genetic circuit for complex diseases.

Results: The approach started with the identification of multiple disease pathways, called a gene forest, whose genes were extracted from the decision forest constructed by supervised learning of the genome-wide transcriptional profiles of patient and normal samples. Based on the newly identified disease mechanisms, a novel pair-wise relevance metric, the adjusted frequency value, was used to define the degree of genetic relationship between two molecular determinants. We applied the proposed method to a publicly available microarray dataset for colon cancer. The results demonstrated that the colon cancer-specific gene network captured the most important genetic interactions in several cellular processes, such as proliferation, apoptosis, differentiation, mitogenesis and immunity, which are known to be pivotal for tumourigenesis. Further analysis of the topological architecture of the network identified three known hub cancer genes [interleukin 8 (IL8) (p ≈ 0), desmin (DES) (p = 2.71 × 10⁻⁶) and enolase 1 (ENO1) (p = 4.19 × 10⁻⁵)], while two novel hub genes [RNA binding motif protein 9 (RBM9) (p = 1.50 × 10⁻⁴) and ribosomal protein L30 (RPL30) (p = 1.50 × 10⁻⁴)] may define new central elements in the gene network specific to colon cancer. Gene Ontology (GO) based analysis of the colon cancer-specific gene network, and of the sub-network consisting of three-way gene interactions, suggested that tumourigenesis in colon cancer resulted from dysfunction in protein biosynthesis and in categories associated with the ribonucleoprotein complex, a finding well supported by multiple lines of experimental evidence.
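
The general idea of scoring gene pairs by how often they appear together across the trees of a decision forest, and then reading hub genes off the resulting network, can be illustrated with a minimal sketch. This is not the adjusted frequency value used in the study, which the abstract does not define; it uses a plain co-occurrence count as a stand-in. The function names, the `min_count` threshold and the toy gene labels (G1 to G4) are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_frequencies(trees):
    """Count, for every unordered gene pair, how many trees contain both genes."""
    counts = Counter()
    for genes in trees:
        # Sort and de-duplicate so each pair has one canonical (a, b) form.
        for a, b in combinations(sorted(set(genes)), 2):
            counts[(a, b)] += 1
    return counts


def build_network(trees, min_count=2):
    """Keep pairs seen together in at least `min_count` trees as edges.

    Assumption: a raw count threshold stands in for the paper's metric.
    """
    freqs = pair_frequencies(trees)
    return {pair: n for pair, n in freqs.items() if n >= min_count}


def degree(edges):
    """Number of edges touching each gene; high-degree genes are hub candidates."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Toy input (hypothetical): each inner list holds the genes used by one tree.
toy_trees = [
    ["G1", "G2", "G3"],
    ["G1", "G2"],
    ["G1", "G3"],
    ["G2", "G4"],
]

edges = build_network(toy_trees, min_count=2)
hubs = degree(edges)
```

In this toy forest only the pairs (G1, G2) and (G1, G3) co-occur in two or more trees, so G1 has the highest degree and would be the hub candidate. A real analysis would also need a significance test for hub status, as the reported p-values indicate.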

Conclusion: This study demonstrated that IL8, DES and ENO1 act as the central elements in colon cancer susceptibility, and that protein biosynthesis and the ribosome-associated function categories largely account for colon cancer tumourigenesis. Thus, the newly developed relevance-based networking approach offers a powerful means to reverse-engineer the disease-specific network and is a promising tool for the systematic dissection of complex diseases.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Cell Line, Tumor
  • Colonic Neoplasms / genetics*
  • DNA-Binding Proteins / genetics*
  • Desmin / genetics*
  • Gene Regulatory Networks*
  • Humans
  • Interleukin-8 / genetics*
  • Oligonucleotide Array Sequence Analysis
  • Phosphopyruvate Hydratase / genetics*
  • Tumor Suppressor Proteins / genetics*

Substances

  • Biomarkers, Tumor
  • DNA-Binding Proteins
  • Desmin
  • Interleukin-8
  • Tumor Suppressor Proteins
  • ENO1 protein, human
  • Phosphopyruvate Hydratase