Analyzing gene coexpression data by an evolutionary model

Genome Inform. 2010:24:154-63.


Coexpressed genes tend to be translated into proteins that are involved in similar biological functions. Here, we constructed gene coexpression networks from collected microarray data of the organisms Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli. Their degree distributions share a common property: an overrepresentation of highly connected nodes followed by a sudden truncation. To analyze this behavior, we present an evolutionary model that simulates genetic evolution. The model assumes that new genes emerge by duplication from a small initial set of primordial genes. It does not include the removal of unused genes, but selective pressure is taken into account indirectly by preferentially duplicating old genes. Thus, a gene duplication represents both the emergence of a new gene and its successful establishment. After each duplication event, all genes are slightly mutated, and these iterated mutations alter their expression patterns. The model reproduces global properties of the investigated coexpression networks: it reflects the mean inter-node distances and, in particular, the characteristic humps in the degree distribution that, in the biological examples, result from functionally related genes.
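The duplication-and-mutation scheme described above can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not specify the details, so everything concrete here is an assumption made for illustration: genes are represented as expression vectors over a fixed number of conditions, older genes are favored for duplication with a weight of age plus one, mutation is small Gaussian noise added to every gene after each duplication, and two genes are linked in the coexpression network when their Pearson correlation reaches a fixed threshold. All function names and parameter values (`simulate_genes`, `mutation_sd`, `threshold`, and so on) are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def simulate_genes(n_genes=60, n_primordial=3, n_conditions=20,
                   mutation_sd=0.05, seed=0):
    """Grow a gene set by duplication, mutating all genes after each event."""
    rng = random.Random(seed)
    # Primordial genes: random expression profiles (assumed representation).
    genes = [[rng.gauss(0, 1) for _ in range(n_conditions)]
             for _ in range(n_primordial)]
    birth = [0] * n_primordial  # step at which each gene appeared
    step = 0
    while len(genes) < n_genes:
        step += 1
        # Assumed weighting: older genes are more likely to be duplicated.
        weights = [step - b + 1 for b in birth]
        parent = rng.choices(range(len(genes)), weights=weights, k=1)[0]
        genes.append(list(genes[parent]))  # the duplicate starts as a copy
        birth.append(step)
        # After the duplication, every gene is slightly mutated.
        for profile in genes:
            for i in range(n_conditions):
                profile[i] += rng.gauss(0, mutation_sd)
    return genes


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


def coexpression_degrees(genes, threshold=0.8):
    """Node degrees of the network linking genes with correlation >= threshold."""
    n = len(genes)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(genes[i], genes[j]) >= threshold:
                degree[i] += 1
                degree[j] += 1
    return degree


def degree_distribution(degree):
    """Sorted list of (degree, number of genes with that degree)."""
    return sorted(Counter(degree).items())
```

Calling `degree_distribution(coexpression_degrees(simulate_genes()))` yields the degree histogram of one simulated network. Whether such a toy run shows the overrepresentation of highly connected nodes reported for the biological networks depends on the assumed parameters and is not claimed here.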

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Arabidopsis / genetics*
  • Computational Biology / methods
  • Computer Simulation
  • Escherichia coli / genetics*
  • Evolution, Molecular*
  • Gene Expression Profiling*
  • Gene Expression Regulation
  • Gene Regulatory Networks*
  • Models, Genetic
  • Mutation
  • Oligonucleotide Array Sequence Analysis
  • Probability
  • Saccharomyces cerevisiae / genetics*