Pervasiveness of gene conservation and persistence of duplicates in cellular genomes

J Mol Evol. 1999 Nov;49(5):591-600. doi: 10.1007/pl00006580.

Abstract

This work presents detailed statistics on ancestral gene duplication and gene conservation in completely sequenced cellular genomes. Analysis of open reading frame (ORF) products having simultaneous matches in several distinct organisms showed a significant correlation between duplication and conservation. Systematic comparisons of the predicted proteomes of 23 organisms (including 20 that have been completely sequenced) have allowed us to quantify the degree of ancestral duplication within each genome and the level of conservation between genomes, using threshold values calculated for individual organisms. Statistical analysis of various gene proportions revealed several trends in gene structure and evolution: (a) at least one-quarter (25%-66%) of the predicted ORF products of the surveyed organisms are not unique, indicating a high level of ancestral duplication; (b) levels of exclusive conservation within Bacteria are higher than those within the eukaryal or archaeal domains; and (c) about one-half or more (47%-99%) of the total predicted ORF products in the surveyed genomes have one or several highly significant matches in another genome. Significant matches are based on simulations taking into account the mean size of ORF products and the composition of each target organism's proteome. The methodology we have developed ensures stability and comparability of our results as the number of completely sequenced genomes increases.
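
The proportions reported in the abstract can be illustrated with a minimal sketch. It assumes that each ORF product already has a best match score against its own genome and against each other genome, and that one significance threshold per target genome is available. The names `OrfHits`, `is_duplicated`, `is_conserved`, and `summarize`, and the use of a single score cutoff per genome, are hypothetical and chosen for illustration; the abstract does not describe the authors' scoring or their simulation procedure in this detail.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OrfHits:
    """Best match scores for one predicted ORF product (hypothetical layout)."""
    orf_id: str
    # Best score against any *other* ORF product of the same genome.
    best_self_score: float
    # Best score against each other genome, keyed by genome name.
    best_other_scores: Dict[str, float] = field(default_factory=dict)


def is_duplicated(orf: OrfHits, genome: str, thresholds: Dict[str, float]) -> bool:
    # An ORF counts as "not unique" if it matches a paralog above the
    # threshold assumed for its own genome.
    return orf.best_self_score >= thresholds[genome]


def is_conserved(orf: OrfHits, thresholds: Dict[str, float]) -> bool:
    # An ORF counts as conserved if it has a significant match in at least
    # one other genome, judged by that target genome's threshold.
    return any(
        score >= thresholds[target]
        for target, score in orf.best_other_scores.items()
    )


def summarize(orfs: List[OrfHits], genome: str,
              thresholds: Dict[str, float]) -> Dict[str, float]:
    """Return the fraction of ORFs that are duplicated, conserved, or both."""
    total = len(orfs)
    if total == 0:
        raise ValueError("no ORFs supplied")
    dup = [is_duplicated(o, genome, thresholds) for o in orfs]
    con = [is_conserved(o, thresholds) for o in orfs]
    return {
        "duplicated": sum(dup) / total,
        "conserved": sum(con) / total,
        "both": sum(d and c for d, c in zip(dup, con)) / total,
    }
```

Calling `summarize` once per genome gives the kind of per-organism percentages quoted in points (a) and (c), and the `both` fraction is one simple way to look at the overlap between duplication and conservation. Keeping a separate threshold per target genome mirrors the abstract's statement that thresholds are calculated for individual organisms.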

Publication types

  • Comparative Study

MeSH terms

  • Base Sequence
  • Conserved Sequence
  • Evolution, Molecular*
  • Gene Duplication*
  • Genome*
  • Genome, Archaeal
  • Genome, Bacterial
  • Genome, Fungal
  • Open Reading Frames