Correlating traits of gene retention, sequence divergence, duplicability and essentiality in vertebrates, arthropods, and fungi

Genome Biol Evol. 2011:3:75-86. doi: 10.1093/gbe/evq083. Epub 2010 Dec 9.


Delineating ancestral gene relations among a large set of sequenced eukaryotic genomes allowed us to rigorously examine links between evolutionary and functional traits. We classified 86% of over 1.36 million protein-coding genes from 40 vertebrates, 23 arthropods, and 32 fungi into orthologous groups and linked over 90% of them to Gene Ontology or InterPro annotations. Quantifying properties of ortholog phyletic retention, copy-number variation, and sequence conservation, we examined correlations with gene essentiality and functional traits. More than half of vertebrate, arthropod, and fungal orthologs are universally present across each lineage. These universal orthologs are preferentially distributed in groups with almost all single-copy or all multicopy genes, and sequence evolution of the predominantly single-copy orthologous groups is markedly more constrained. Essential genes from representative model organisms, Mus musculus, Drosophila melanogaster, and Saccharomyces cerevisiae, are significantly enriched in universal orthologs within each lineage, and essential-gene-containing groups consistently exhibit greater sequence conservation than those without. This study of eukaryotic gene repertoire evolution identifies shared fundamental principles and highlights lineage-specific features. It also confirms that essential genes are highly retained and conclusively supports the "knockout-rate prediction" of stronger constraints on essential gene sequence evolution. However, the distinction between sequence conservation of single- versus multicopy orthologs is quantitatively more prominent than that between orthologous groups with and without essential genes. The previously underappreciated difference in the tolerance of gene duplications and the contrasting evolutionary modes of "single-copy control" versus "multicopy license" may reflect a major evolutionary mechanism that allows extended exploration of gene sequence space.
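As an illustration only, the per-group quantities named in the abstract (phyletic retention, universality, and single-copy versus multicopy status) could be computed along the following lines. This is a minimal sketch, not the authors' pipeline: the function name `summarize_group`, the input layout (a mapping from species to gene copy count), and the species labels are all assumptions made for the example.

```python
from typing import Dict, List


def summarize_group(copies: Dict[str, int], all_species: List[str]) -> dict:
    """Summarize one orthologous group (hypothetical helper).

    copies      maps a species name to its gene copy count in this group
    all_species lists every species in the lineage being analysed
    """
    # Species that retain at least one gene in the group
    present = [s for s in all_species if copies.get(s, 0) > 0]
    # Of those, the species carrying exactly one copy
    single = [s for s in present if copies[s] == 1]

    return {
        # Phyletic retention: fraction of species with an ortholog
        "retention": len(present) / len(all_species),
        # Universal: present in every species of the lineage
        "universal": len(present) == len(all_species),
        # Share of retaining species that are single-copy
        "single_copy_fraction": len(single) / len(present) if present else 0.0,
    }


# Toy data, invented for illustration
species = ["sp1", "sp2", "sp3", "sp4"]
single_copy_group = {"sp1": 1, "sp2": 1, "sp3": 1, "sp4": 1}
patchy_multicopy_group = {"sp1": 2, "sp2": 3, "sp4": 1}

print(summarize_group(single_copy_group, species))
print(summarize_group(patchy_multicopy_group, species))
```

Summaries of this kind could then be compared against per-group sequence conservation or essentiality labels; how those measures were actually defined and thresholded is not specified in the abstract.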

Publication types

  • Comparative Study

MeSH terms

  • Animals
  • Arthropods / classification
  • Arthropods / genetics*
  • Computational Biology
  • Evolution, Molecular*
  • Fungi / classification
  • Fungi / genetics*
  • Gene Duplication*
  • Genes, Essential*
  • Genome
  • Phylogeny
  • Proteome
  • Quantitative Trait Loci
  • Vertebrates / classification
  • Vertebrates / genetics*

