An emerging phylogenetic core of Archaea: phylogenies of transcription and translation machineries converge following addition of new genome sequences

BMC Evol Biol. 2005 Jun 2;5:36. doi: 10.1186/1471-2148-5-36.


Background: The concept of a genomic core, defined as the set of genes present in all genomes of a monophyletic group, has become crucial in comparative and evolutionary genomics. However, it is still a matter of debate whether lateral gene transfers (LGT) may affect the components of genomic cores and thereby prevent their use for retracing species evolution. We have recently reconstructed the phylogeny of Archaea by using two large concatenated datasets of core proteins involved in translation and transcription, respectively. The resulting trees were largely congruent, showing that informational gene components of the archaeal genomic core belonging to two distinct molecular systems contain a coherent signal for archaeal phylogeny. However, some incongruence remained between the two phylogenies. This may be due to undetected LGT and/or to a lack of sufficient phylogenetic signal in the datasets.

Results: We present evidence strongly favoring the latter hypothesis. We have updated our transcription and translation datasets with five new archaeal genomes for a total of 6384 and 2928 amino acid positions, respectively, and 25 taxa. This increase in taxonomic sampling led to the nearly complete convergence of the transcription-based and translation-based trees on a single phylogenetic pattern for archaeal evolution. Only a single incongruence persisted between the two phylogenies. This concerned Methanopyrus kandleri, whose placement remained strongly biased in the transcription tree owing to its above-average evolutionary rates, a bias that could not be counterbalanced because no closely related and/or slower-evolving relatives were available.

Conclusion: To our knowledge, this is the first report of evidence that the phylogenetic signal harbored by components of the archaeal translation apparatus is confirmed by additional markers belonging to a second molecular system (i.e. transcription). This rules out the risk of circularity when inferring species evolution from small subunit ribosomal RNA and ribosomal protein sequences, a risk raised by the suggestion that concerted LGT may affect these markers. Our results strongly support the existence of a core of proteins that has evolved mainly through vertical inheritance in Archaea and carries a bona fide phylogenetic signal that can be used to retrace the evolutionary history of this domain. The identification and analysis of additional molecular markers not affected by LGT should continue to define the emerging picture of a genuine phylogenetic core for the third domain of life.

MeSH terms

  • Algorithms
  • Animals
  • Archaea / genetics*
  • Computational Biology / methods*
  • Evolution, Molecular
  • Genes, Archaeal
  • Genome, Archaeal*
  • Likelihood Functions
  • Phylogeny
  • Protein Biosynthesis*
  • Protein Structure, Tertiary
  • RNA, Ribosomal / genetics
  • Software
  • Transcription, Genetic*


Substances

  • RNA, Ribosomal