Making connections between novel transcription factors and their DNA motifs

Genome Res. 2005 Feb;15(2):312-20. doi: 10.1101/gr.3069205. Epub 2005 Jan 14.

Abstract

The key components of a transcriptional regulatory network are the connections between trans-acting transcription factors and cis-acting DNA-binding sites. Despite several decades of intense research, only a fraction of the approximately 300 transcription factors estimated in Escherichia coli have been linked to some of their binding sites in the genome. In this paper, we present a computational method to connect novel transcription factors and DNA motifs in E. coli. Our method uses three mutually independent types of information: two are gleaned by comparative analysis of multiple genomes, and the third is derived from similarities of transcription-factor-DNA-binding-site interactions. The different types of information are combined to calculate the probability that a given transcription-factor-DNA-motif pair is a true pair. Tested on a study set of transcription factors and their DNA motifs, our method has a prediction accuracy of 59% for the top-ranked predictions and 85% for the top three predictions. When applied to 99 novel transcription factors and 70 novel DNA motifs, our method predicted 64 transcription-factor-DNA-motif pairs. Supporting evidence for some of the predicted pairs is presented. Functional annotations are made for 23 novel transcription factors based on the predicted transcription-factor-DNA-motif connections.
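
The abstract does not specify how the three types of information are combined or how accuracy is scored. As an illustration only, the sketch below assumes a naive-Bayes style combination, in which each independent evidence source contributes a likelihood ratio that multiplies the prior odds, and scores the ranking with a top-k accuracy. The function names, the prior, and the likelihood-ratio inputs are all hypothetical and are not taken from the paper.

```python
from math import prod


def posterior(prior, likelihood_ratios):
    """Posterior probability that a TF-motif pair is true.

    Assumption (not from the paper): evidence sources are independent, so
    their likelihood ratios multiply the prior odds (naive Bayes).
    """
    odds = prior / (1.0 - prior) * prod(likelihood_ratios)
    return odds / (1.0 + odds)


def rank_motifs(motif_probs):
    """Return motif names ordered from most to least probable."""
    return sorted(motif_probs, key=motif_probs.get, reverse=True)


def top_k_accuracy(ranked_by_tf, true_motif, k):
    """Fraction of TFs whose known motif is among their top k predictions."""
    hits = sum(
        1 for tf, ranked in ranked_by_tf.items() if true_motif[tf] in ranked[:k]
    )
    return hits / len(ranked_by_tf)
```

Under these assumptions, a pair would be scored with `posterior(prior, [lr_1, lr_2, lr_3])`, one hypothetical likelihood ratio per evidence type. Each transcription factor's candidate motifs are then ordered with `rank_motifs`, and `top_k_accuracy` with `k=1` or `k=3` gives the kind of top-one and top-three figures the abstract reports.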

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Base Composition / genetics*
  • Binding Sites / genetics
  • DNA, Bacterial / genetics*
  • Genome, Bacterial
  • Gram-Negative Bacteria / cytology
  • Gram-Negative Bacteria / genetics
  • Gram-Negative Bacteria / metabolism
  • Peptides / genetics
  • Phylogeny
  • Predictive Value of Tests
  • Protein Binding / genetics
  • Protein Structure, Tertiary / genetics
  • Regulon / genetics
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism
  • Transcription, Genetic / genetics

Substances

  • DNA, Bacterial
  • Peptides
  • Transcription Factors