Universal trees based on large combined protein sequence data sets

Nat Genet. 2001 Jul;28(3):281-5. doi: 10.1038/90129.

Abstract

Universal trees of life based on small-subunit (SSU) ribosomal RNA (rRNA) support the separate mono/holophyly of the domains Archaea (archaebacteria), Bacteria (eubacteria) and Eucarya (eukaryotes) and the placement of extreme thermophiles at the base of the Bacteria. The concept of universal tree reconstruction has recently been upset by protein trees that show intermixing of species from different domains. Such tree topologies have been attributed either to extensive horizontal gene transfer or to degradation of phylogenetic signals because of saturation of amino acid substitutions. Here we use large combined alignments of 23 orthologous proteins conserved across 45 species from all domains to construct highly robust universal trees. Although individual protein trees are variable in their support of domain integrity, trees based on combined protein data sets strongly support separate monophyletic domains. Within the Bacteria, we placed spirochaetes as the earliest derived bacterial group. However, eliminating nine protein data sets that were likely candidates for horizontal gene transfer from the combined protein alignment resulted in trees showing thermophiles as the earliest evolved bacterial lineage. Thus, combined protein universal trees are highly congruent with SSU rRNA trees in their strong support for the separate monophyly of domains as well as the early evolution of thermophilic Bacteria.
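
The combined-alignment step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `concatenate`, the `exclude` parameter, and the toy protein and species labels are all hypothetical, and the sketch assumes each protein alignment is already aligned and covers the same set of species. It shows only how per-protein alignments might be joined end to end into one combined alignment, and how a subset of proteins (for example, suspected horizontal gene transfer candidates) might be left out before tree building.

```python
from typing import Dict, Iterable

# Hypothetical representation: species name -> aligned sequence (gaps as "-").
Alignment = Dict[str, str]


def concatenate(alignments: Dict[str, Alignment],
                exclude: Iterable[str] = ()) -> Alignment:
    """Join per-protein alignments end to end, skipping excluded proteins."""
    skip = set(exclude)
    species = None
    combined: Alignment = {}
    for protein, aln in alignments.items():
        if protein in skip:
            continue
        # Every sequence within one protein alignment must have equal length.
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError(f"{protein}: sequences differ in length")
        if species is None:
            species = set(aln)
            combined = {sp: "" for sp in aln}
        elif set(aln) != species:
            raise ValueError(f"{protein}: species set differs from the others")
        for sp, seq in aln.items():
            combined[sp] += seq
    return combined


# Toy data only; these are not sequences from the study.
toy = {
    "protA": {"sp1": "MK-L", "sp2": "MKAL", "sp3": "MRAL"},
    "protB": {"sp1": "GG", "sp2": "GA", "sp3": "G-"},
    "protC": {"sp1": "TTT", "sp2": "TST", "sp3": "TTS"},
}

full = concatenate(toy)                        # all proteins combined
reduced = concatenate(toy, exclude=["protB"])  # one candidate removed
```

In this sketch, comparing trees built from `full` and `reduced` would mirror the abstract's comparison of the complete combined data set against the one with likely transfer candidates removed. Tree inference itself is outside the scope of the example.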

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Archaea / genetics
  • Bacteria / genetics
  • Conserved Sequence
  • Databases, Factual
  • Eukaryotic Cells
  • Evolution, Molecular*
  • Genomics*
  • Phylogeny*
  • Sequence Alignment
  • Sequence Analysis, Protein / methods*