CollecTF: a database of experimentally validated transcription factor-binding sites in Bacteria

Nucleic Acids Res. 2014 Jan;42(Database issue):D156-60. doi: 10.1093/nar/gkt1123. Epub 2013 Nov 14.


The influx of high-throughput data and the need for complex models to describe the interaction of prokaryotic transcription factors (TF) with their target sites pose new challenges for TF-binding site databases. CollecTF compiles data on experimentally validated, naturally occurring TF-binding sites across the Bacteria domain, placing a strong emphasis on the transparency of the curation process, the quality and availability of the stored data and fully customizable access to its records. CollecTF integrates multiple sources of data automatically and openly, allowing users to dynamically redefine binding motifs and their experimental support base. Data quality and currency are fostered in CollecTF by adopting a sustainable model that encourages direct author submissions in combination with in-house validation and curation of published literature. CollecTF entries are periodically submitted to NCBI for integration into RefSeq complete genome records as link-out features, maximizing the visibility of the data and enriching the annotation of RefSeq files with regulatory information. Seeking to facilitate comparative genomics and machine-learning analyses of regulatory interactions, in its initial release CollecTF provides domain-wide coverage of two TF families (LexA and Fur), as well as extensive representation for a clinically important bacterial family, the Vibrionaceae.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bacteria / genetics*
  • Bacterial Proteins / metabolism*
  • Binding Sites
  • DNA, Bacterial / chemistry
  • DNA, Bacterial / metabolism*
  • Databases, Genetic*
  • Genome, Bacterial
  • Internet
  • Nucleic Acid Conformation
  • Position-Specific Scoring Matrices
  • Regulatory Elements, Transcriptional*
  • Serine Endopeptidases / metabolism
  • Transcription Factors / metabolism*

Substances


  • Bacterial Proteins
  • DNA, Bacterial
  • LexA protein, Bacteria
  • Transcription Factors
  • Serine Endopeptidases