Similarities and differences in genome-wide expression data of six organisms

PLoS Biol. 2004 Jan;2(1):E9. doi: 10.1371/journal.pbio.0020009. Epub 2003 Dec 15.


Comparing genomic properties of different organisms is of fundamental importance in the study of biological and evolutionary principles. Although differences among organisms are often attributed to differential gene expression, genome-wide comparative analysis thus far has been based primarily on genomic sequence information. We present a comparative study of large datasets of expression profiles from six evolutionarily distant organisms: S. cerevisiae, C. elegans, E. coli, A. thaliana, D. melanogaster, and H. sapiens. We use genomic sequence information to connect these data and compare global and modular properties of the transcription programs. Linking genes whose expression profiles are similar, we find that for all organisms the connectivity distribution follows a power law, highly connected genes tend to be essential and conserved, and the expression program is highly modular. We reveal the modular structure by decomposing each set of expression data into coexpressed modules. Functionally related sets of genes are frequently coexpressed in multiple organisms. Yet their relative importance to the transcription program and their regulatory relationships vary among organisms. Our results demonstrate the potential of combining sequence and expression data for improving functional gene annotation and expanding our understanding of how gene expression and diversity evolved.
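
The idea of linking genes with similar expression profiles and examining the resulting connectivity distribution can be illustrated with a minimal sketch. The abstract does not state the similarity measure or cutoff used in the study, so the Pearson correlation, the `threshold` value, and the function names below are assumptions chosen for illustration only, not the authors' method.

```python
import numpy as np


def coexpression_degrees(expr, threshold=0.8):
    """Count, for each gene, how many other genes it is linked to.

    expr: 2-D array, one row per gene, one column per condition.
    threshold: hypothetical Pearson-correlation cutoff for drawing a link
               (an assumption; the abstract gives no cutoff).
    """
    corr = np.corrcoef(expr)          # gene-by-gene Pearson correlation
    np.fill_diagonal(corr, 0.0)       # ignore self-links
    adjacency = corr >= threshold     # link genes with similar profiles
    return adjacency.sum(axis=1)      # connectivity (degree) of each gene


def degree_distribution(degrees):
    """Return the observed degrees k and the fraction of genes with each k."""
    counts = np.bincount(degrees)
    ks = np.nonzero(counts)[0]
    return ks, counts[ks] / degrees.size


# Toy data: three genes that rise together and one that falls.
toy = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 2.0, 3.0, 5.0],
    [4.0, 3.0, 2.0, 1.0],
])
toy_degrees = coexpression_degrees(toy)
toy_ks, toy_fractions = degree_distribution(toy_degrees)
```

On a real dataset, a power-law connectivity distribution would appear as an approximately straight line when the fraction of genes is plotted against the degree on log-log axes. Genes with constant expression across all conditions yield undefined correlations and would need to be filtered out before calling `np.corrcoef`.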

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Arabidopsis / genetics
  • Caenorhabditis elegans / genetics
  • Cluster Analysis
  • Databases, Genetic
  • Drosophila melanogaster / genetics
  • Escherichia coli / genetics
  • Evolution, Molecular
  • Gene Deletion
  • Gene Expression Profiling
  • Genes, Plant
  • Genome*
  • Genome, Bacterial
  • Genome, Fungal
  • Genome, Human
  • Genomics / methods*
  • Humans
  • Internet
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis
  • Open Reading Frames
  • RNA Interference
  • Saccharomyces cerevisiae / genetics
  • Sequence Analysis, DNA
  • Species Specificity
  • Transcription, Genetic