An information theoretic exploratory method for learning patterns of conditional gene coexpression from microarray data

IEEE/ACM Trans Comput Biol Bioinform. 2008 Jan-Mar;5(1):15-24. doi: 10.1109/TCBB.2007.1056.

Abstract

In this article, we introduce an exploratory framework for learning patterns of conditional co-expression in gene expression data. The main idea is to estimate how the information shared by a set of M nodes in a network (where each node is associated with an expression profile) changes upon conditioning on a set of L conditioning variables (in the simplest case, a separate set of expression profiles). The method is non-parametric and is based on the concept of statistical co-information, which, unlike conventional correlation-based techniques, is not restricted to linear conditional dependency patterns. Moreover, such conditional co-expression relationships can potentially indicate regulatory interactions that do not manifest themselves when only pairwise relationships are considered. A moment-based approximation of the co-information measure is derived that avoids estimating high-dimensional multivariate probability density functions from the data, a task that is usually not viable given the limited sample sizes that characterize expression level measurements. By applying the proposed exploratory method, we analyzed a whole-genome microarray assay of the eukaryote Saccharomyces cerevisiae and were able to learn statistically significant patterns of conditional co-expression. A selection of these interactions that carry a meaningful biological interpretation is discussed.
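
To make the quantity concrete, here is a minimal sketch of co-information and conditional mutual information computed from entropies. The abstract does not give the form of the moment-based approximation, so this sketch substitutes an assumption: entropies are taken from a Gaussian fitted to the sample covariance (second moments only). That stand-in captures only linear dependence and is not the estimator derived in the article. The function names (gaussian_entropy, co_information, conditional_mi) are hypothetical, chosen for illustration.

```python
import itertools

import numpy as np


def gaussian_entropy(data):
    """Differential entropy (nats) of a Gaussian fitted to `data`.

    `data` has shape (n_samples, d). ASSUMPTION: a second-moment
    (Gaussian) stand-in, not the approximation derived in the article.
    """
    d = data.shape[1]
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("sample covariance is not positive definite")
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)


def co_information(data):
    """Co-information of the columns of `data` by inclusion-exclusion.

    Sums -(-1)^{|T|} H(T) over all non-empty column subsets T. For two
    columns this is the mutual information I(X;Y); for three columns it
    equals I(X;Y) - I(X;Y|Z).
    """
    m = data.shape[1]
    total = 0.0
    for k in range(1, m + 1):
        for subset in itertools.combinations(range(m), k):
            total -= (-1) ** k * gaussian_entropy(data[:, list(subset)])
    return total


def conditional_mi(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z); inputs are 2-D arrays."""
    return (
        gaussian_entropy(np.hstack([x, z]))
        + gaussian_entropy(np.hstack([y, z]))
        - gaussian_entropy(np.hstack([x, y, z]))
        - gaussian_entropy(z)
    )


if __name__ == "__main__":
    # Synthetic profiles (not real expression data): X and Y are independent,
    # but become strongly dependent once Z = X + Y + noise is known.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5000, 1))
    y = rng.normal(size=(5000, 1))
    z = x + y + 0.1 * rng.normal(size=(5000, 1))
    print("I(X;Y)   =", co_information(np.hstack([x, y])))
    print("I(X;Y|Z) =", conditional_mi(x, y, z))
    print("I(X;Y;Z) =", co_information(np.hstack([x, y, z])))
```

In the synthetic example the pairwise mutual information between X and Y is close to zero while the conditional mutual information given Z is large, so the co-information is negative. This is the kind of relationship the abstract describes as invisible when only pairwise relationships are considered. Note that the inclusion-exclusion sum visits every subset of columns, so its cost grows exponentially with the number of variables.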

MeSH terms

  • Algorithms
  • Artificial Intelligence
  • Computational Biology / methods
  • Gene Expression Profiling*
  • Gene Expression Regulation, Fungal
  • Internet
  • Oligonucleotide Array Sequence Analysis / methods*
  • Pattern Recognition, Automated / methods*
  • Saccharomyces cerevisiae Proteins / genetics*
  • Software
  • Statistics, Nonparametric

Substances

  • Saccharomyces cerevisiae Proteins