Reverse engineering molecular regulatory networks from microarray data with qp-graphs

J Comput Biol. 2009 Feb;16(2):213-27. doi: 10.1089/cmb.2008.08TT.

Abstract

Reverse engineering bioinformatic procedures applied to high-throughput experimental data have become instrumental in generating new hypotheses about molecular regulatory mechanisms. This has been particularly the case for gene expression microarray data, where a large number of statistical and computational methodologies have been developed to assist in building network models of transcriptional regulation. A major challenge faced by every such procedure is that the number of available samples n for estimating the network model is much smaller than the number of genes p forming the system under study. This compromises many of the assumptions on which the statistics of the methods rely, often leading to unstable performance figures. In this work, we apply a recently developed methodology based on the so-called q-order limited partial correlation graphs, qp-graphs, which is specifically tailored towards molecular network discovery from microarray expression data with p >> n. Using experimental and functional annotation data from Escherichia coli, we show that qp-graphs yield more stable performance figures than other state-of-the-art methods when the ratio of genes to experiments exceeds one order of magnitude. More importantly, we also show that the better performance of the qp-graph method at such a gene-to-sample ratio has a decisive impact on the functional coherence of the reverse-engineered transcriptional regulatory modules, and becomes crucial in such a challenging situation for enabling the discovery of a network of reasonable confidence that includes a substantial number of genes relevant to the assayed conditions. An R package called qpgraph, implementing this method, is part of the Bioconductor project and can be downloaded from www.bioconductor.org. A parallel standalone version for the most computationally expensive calculations is available from http://functionalgenomics.upf.edu/qpgraph.
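
The abstract does not describe how q-order partial correlations are used in practice, so the following is only a minimal sketch of the general idea, not the qpgraph package's implementation. When p >> n, the partial correlation of two genes given all the others cannot be estimated, but the partial correlation given a small conditioning set Q of size q < n - 3 can. The sketch samples random conditioning sets and records how often the hypothesis of zero partial correlation is not rejected. The function names (partial_corr, non_rejection_rate), the Fisher z approximation, and the 1.96 cutoff (a two-sided 5% level) are assumptions made for illustration.

```python
import numpy as np


def partial_corr(X, i, j, Q):
    """Sample partial correlation of columns i and j of X given the columns in Q.

    X is an (n samples) x (p genes) matrix. Only the q + 2 columns involved
    are used, so the covariance block stays invertible when q + 2 < n.
    """
    idx = [i, j] + list(Q)
    S = np.cov(X[:, idx], rowvar=False)   # (q+2) x (q+2) sample covariance
    K = np.linalg.inv(S)                  # precision matrix of the block
    return -K[0, 1] / np.sqrt(K[0, 0] * K[1, 1])


def non_rejection_rate(X, i, j, q, n_tests=200, z_crit=1.96, seed=None):
    """Fraction of random size-q conditioning sets for which a zero partial
    correlation between genes i and j is NOT rejected.

    Uses the Fisher z approximation (an illustrative choice): under the
    null hypothesis, sqrt(n - q - 3) * atanh(r) is roughly standard normal.
    """
    n, p = X.shape
    if q >= n - 3:
        raise ValueError("q must be smaller than n - 3")
    rng = np.random.default_rng(seed)
    others = [k for k in range(p) if k not in (i, j)]
    not_rejected = 0
    for _ in range(n_tests):
        Q = rng.choice(others, size=q, replace=False)
        r = np.clip(partial_corr(X, i, j, Q), -0.999999, 0.999999)
        z = np.sqrt(n - q - 3) * np.arctanh(r)
        if abs(z) <= z_crit:
            not_rejected += 1
    return not_rejected / n_tests


if __name__ == "__main__":
    # Synthetic data with far fewer samples than genes (n = 20, p = 50).
    rng = np.random.default_rng(0)
    X = rng.standard_normal((20, 50))
    X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(20)  # genes 0 and 1 linked
    print("linked pair:   ", non_rejection_rate(X, 0, 1, q=3, seed=1))
    print("unrelated pair:", non_rejection_rate(X, 2, 3, q=3, seed=1))
```

In this sketch a rate close to 0 suggests that the association between the two genes survives conditioning on other genes, while a rate close to 1 suggests the pair is not directly associated. A network would then be formed by keeping the pairs whose rate falls below a chosen threshold.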

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Computer Simulation*
  • Escherichia coli / genetics
  • Escherichia coli / metabolism
  • Gene Regulatory Networks*
  • Metabolic Networks and Pathways
  • Models, Biological*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Software