GenomeHistory: a software tool and its application to fully sequenced genomes

Nucleic Acids Res. 2002 Aug 1;30(15):3378-86. doi: 10.1093/nar/gkf449.


We present a publicly available software tool that identifies all pairs of duplicate genes in a genome and then determines the degree of synonymous and non-synonymous divergence between each duplicate pair. Using this tool, we analyze the relations between (i) gene function and the propensity of a gene to duplicate and (ii) the number of genes in a gene family and the family's rate of sequence evolution. We do so for the complete genomes of four eukaryotes (fission and budding yeast, fruit fly and nematode) and one prokaryote (Escherichia coli). For some classes of genes we observe a strong relationship between gene function and a gene's propensity to undergo duplication. Most notably, ribosomal genes and transcription factors appear less likely to undergo gene duplication than other genes. In both fission and budding yeast, we see a strong positive correlation between the selective constraint on a gene and the size of the gene family to which the gene belongs. In contrast, the corresponding correlation in the multicellular eukaryotes is weakly negative.
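
To illustrate the distinction between synonymous and non-synonymous divergence, the following Python sketch classifies the codon differences between two aligned coding sequences of a duplicate pair. It is not the method used by GenomeHistory: the function name `count_codon_differences` and the simple per-codon counting are assumptions made for illustration only. A real analysis would estimate divergence per synonymous and per non-synonymous site under a substitution model.

```python
# Illustrative sketch only; not GenomeHistory's algorithm.
# Standard genetic code, built in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def count_codon_differences(seq_a, seq_b):
    """Count differing codons that keep or change the encoded amino acid.

    Both inputs are assumed to be aligned, in-frame coding sequences of
    equal length. Codons containing gaps or ambiguous bases are skipped.
    Returns a tuple (synonymous, non_synonymous).
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in-frame, and equal in length")

    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue  # identical codons contribute no divergence
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # skip gaps and ambiguous bases
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# TTT -> TTC keeps phenylalanine; GGA -> GCA changes glycine to alanine.
print(count_codon_differences("ATGTTTGGA", "ATGTTCGCA"))
```

Raw counts like these ignore multiple substitutions at the same site and the differing numbers of synonymous and non-synonymous sites per codon, so they serve only to show what the two classes of change mean.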

Publication types

  • Evaluation Study
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Caenorhabditis elegans / genetics
  • Drosophila melanogaster / genetics
  • Escherichia coli / genetics
  • Evolution, Molecular
  • Gene Duplication
  • Genes, Duplicate* / physiology
  • Genome*
  • Genomics / methods*
  • Saccharomyces cerevisiae / genetics
  • Schizosaccharomyces / genetics
  • Sequence Analysis, DNA / methods*
  • Software*