Identification and genomic analysis of transcription factors in archaeal genomes exemplifies their functional architecture and evolutionary origin

Mol Biol Evol. 2010 Jun;27(6):1449-59. doi: 10.1093/molbev/msq033. Epub 2010 Feb 1.


Archaea, which represent a large fraction of the phylogenetic diversity of organisms, are prokaryotes with a eukaryote-like basal transcriptional machinery. This organization makes the study of their DNA-binding transcription factors (TFs) and transcriptional regulatory networks particularly interesting, yet experimental data on archaeal TFs remain limited. In this work, 3,918 TFs were identified and exhaustively analyzed in 52 archaeal genomes. TFs represented less than 5% of the gene products in all the studied species, a level comparable with that of TFs identified in parasites or intracellular pathogenic bacteria, suggesting a deficit in this class of proteins. A total of 75 families were identified, of which the HTH_3, AsnC, TrmB, and ArsR families were abundant and present in all the archaeal genomes. We found that archaeal TFs are significantly smaller than both the products of other protein-coding genes in archaea and bacterial TFs, suggesting that a large fraction of these small TFs could compensate for the probable deficit of TFs in archaea, possibly by forming different combinations of monomers, similar to what is observed in the eukaryotic transcriptional machinery. Our results show that although the DNA-binding domains of archaeal TFs are similar to those of bacteria, ligand-binding domains are underrepresented in smaller TFs, which suggests that protein-protein interactions may act as mediators of regulatory feedback and points to a chimera of bacterial and eukaryotic TF functionality. The analysis presented here contributes to the understanding of the transcriptional apparatus in archaea and provides a framework for the analysis of regulatory networks in these organisms.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Binding Sites
  • Cluster Analysis
  • Evolution, Molecular*
  • Gene Expression Regulation, Archaeal / genetics*
  • Genome, Archaeal*
  • Genome, Bacterial
  • Genomics / methods*
  • Markov Chains
  • Phylogeny
  • Protein Structure, Tertiary
  • Transcription Factors / chemistry
  • Transcription Factors / genetics*

Substances

  • Transcription Factors