Background: Transcription factors (TFs) are DNA-binding proteins that regulate gene expression by activating or repressing transcription. Some have housekeeping roles, while others regulate the expression of specific genes in response to environmental change. The majority of TFs are multi-domain proteins, and they can be divided into families according to their domain organisation. There is a need for user-friendly, rigorous and consistent databases to allow researchers to overcome the inherent variability in annotation between genome sequences.
Description: P2TF (Predicted Prokaryotic Transcription Factors) is an integrated and comprehensive database of prokaryotic transcription factor proteins. The current version of the database contains 372,877 TFs from 1,987 completely sequenced prokaryotic genomes and 43 metagenomes. The database offers annotation, classification and visualisation of TF genes and their genetic context, giving researchers a single resource in which to investigate TFs. P2TF analyses TFs in both predicted proteomes and reconstituted ORFeomes, recovering approximately 3% more TF proteins than screening predicted proteomes alone. Users can search the database with sequence or domain architecture queries, and the resulting hits can be aligned to investigate evolutionary relationships and conservation of residues. To increase utility, all searches can be filtered by taxonomy, TF genes can be added to the P2TF cart, and gene lists can be exported in a variety of formats for external analysis.
Conclusions: P2TF is an open resource for biologists, allowing exploration of all TFs within prokaryotic genomes and metagenomes. The database enables a variety of analyses, and results are presented through an interactive web interface that offers several ways to access and download the data. The database is freely available at http://www.p2tf.org/.
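Illustrative sketch: the taxonomy filtering, grouping by domain architecture and gene list export described above can be pictured with a few lines of Python. This is not the P2TF implementation or its data schema. The `TFRecord` type, the function names, the gene identifiers and the toy records are all hypothetical, chosen only to show how an exported gene list might be handled in an external analysis.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TFRecord:
    """Hypothetical record for one predicted TF (not the P2TF schema)."""
    gene_id: str
    organism: str
    lineage: tuple   # taxonomic lineage, broadest rank first
    domains: tuple   # domain names in N- to C-terminal order


def filter_by_taxon(records, taxon):
    """Keep only records whose lineage contains the given taxon."""
    return [r for r in records if taxon in r.lineage]


def group_by_architecture(records):
    """Group records by their ordered domain architecture."""
    groups = defaultdict(list)
    for r in records:
        groups[r.domains].append(r)
    return dict(groups)


def export_csv(records):
    """Serialise a gene list to CSV text for external analysis."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["gene_id", "organism", "architecture"])
    for r in records:
        writer.writerow([r.gene_id, r.organism, "+".join(r.domains)])
    return buf.getvalue()


# Toy data for illustration only; identifiers are invented.
records = [
    TFRecord("tf_001", "Escherichia coli",
             ("Bacteria", "Proteobacteria"), ("HTH_1", "LysR_substrate")),
    TFRecord("tf_002", "Escherichia coli",
             ("Bacteria", "Proteobacteria"), ("Response_reg", "GerE")),
    TFRecord("tf_003", "Bacillus subtilis",
             ("Bacteria", "Firmicutes"), ("HTH_1", "LysR_substrate")),
]

if __name__ == "__main__":
    selected = filter_by_taxon(records, "Proteobacteria")
    print(export_csv(selected))
```

Keeping the domain architecture as an ordered tuple means that two proteins fall into the same group only when they share the same domains in the same order, which matches the idea of dividing TFs into families by domain organisation.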