The DNA-binding domain as a functional indicator: the case of the AraC/XylS family of transcription factors

Genetica. 2008 May;133(1):65-76. doi: 10.1007/s10709-007-9185-y. Epub 2007 Aug 22.


The AraC/XylS family of transcription factors (TFs), which includes proteins involved in the regulation of diverse biological processes, has attracted considerable interest recently and has been expanding steadily through in silico predictions and experimental analysis. In this work, using a hidden Markov model (HMM) based on the DNA-binding domain of 58 experimentally characterized proteins from the AraC/XylS (A/X) family, we found 1974 A/X proteins in 149 of 212 bacterial genomes. This domain was used as a template to generate a phylogenetic tree and as a tool to predict the putative regulatory role of the new members of the family, based on their proximity to a particular functional cluster in the tree. With this approach we assigned a functional regulatory role to 75% of the TF dataset. Of these, 33.7% regulate genes involved in carbon-source catabolism, 9.6% global metabolism, 8.3% nitrogen metabolism, 2.9% adaptation responses, 8.9% stress responses, and 11.7% virulence. The abundance of TFs involved in the regulation of metabolic processes indicates that bacteria have optimized their regulatory systems to control energy uptake. In contrast, the lower percentage of TFs required for stress, adaptation and virulence regulation reflects the specialization acquired by each subset of TFs associated with those processes. This approach should be useful for assigning regulatory roles to uncharacterized members of other transcription factor families, and it might facilitate their experimental analysis.
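As a quick sanity check on the figures quoted in the abstract, the following minimal Python sketch recomputes the genome coverage, the mean number of A/X proteins per A/X-containing genome, and the total share of the dataset with an assigned role. All numbers are copied from the abstract. The variable names, the `summarize` helper, and the grouping of the first three categories as "metabolic" are assumptions made here for illustration and are not taken from the paper.

```python
# Figures quoted in the abstract.
GENOMES_TOTAL = 212      # bacterial genomes searched
GENOMES_WITH_AX = 149    # genomes containing at least one A/X protein
AX_PROTEINS = 1974       # A/X proteins found by the HMM search

# Percentage of the TF dataset assigned to each regulatory role.
ROLE_PERCENT = {
    "carbon-source catabolism": 33.7,
    "global metabolism": 9.6,
    "nitrogen metabolism": 8.3,
    "adaptation responses": 2.9,
    "stress responses": 8.9,
    "virulence": 11.7,
}

# Assumption for illustration: these three roles are treated as "metabolic".
METABOLIC_ROLES = (
    "carbon-source catabolism",
    "global metabolism",
    "nitrogen metabolism",
)


def summarize():
    """Return derived summary figures, each rounded to one decimal place."""
    metabolic = sum(ROLE_PERCENT[role] for role in METABOLIC_ROLES)
    assigned = sum(ROLE_PERCENT.values())
    return {
        "genome_coverage_percent": round(100 * GENOMES_WITH_AX / GENOMES_TOTAL, 1),
        "mean_ax_per_positive_genome": round(AX_PROTEINS / GENOMES_WITH_AX, 1),
        "assigned_percent": round(assigned, 1),
        "metabolic_percent": round(metabolic, 1),
        "non_metabolic_percent": round(assigned - metabolic, 1),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Under these assumptions the six category percentages sum to 75.1%, which matches the reported 75% of the dataset with an assigned role, and the metabolic categories account for roughly two thirds of the assigned TFs (51.6 of 75.1 percentage points).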

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • AraC Transcription Factor / chemistry*
  • AraC Transcription Factor / genetics
  • AraC Transcription Factor / metabolism*
  • Escherichia coli K12 / genetics
  • Escherichia coli K12 / metabolism
  • Escherichia coli O157 / genetics
  • Escherichia coli O157 / metabolism
  • Escherichia coli O157 / pathogenicity
  • Evolution, Molecular
  • Genome, Bacterial / genetics
  • Multigene Family / genetics*
  • Phylogeny
  • Protein Structure, Tertiary / genetics
  • Pseudomonas syringae / genetics
  • Pseudomonas syringae / metabolism
  • Pseudomonas syringae / pathogenicity
  • Virulence / genetics

