A New Statistical Model to Select Target Sequences Bound by Transcription Factors

Genome Inform. 2006;17(1):134-40.


Transcription factors (TFs) play a key role in gene regulation by binding to target sequences. Predicting in silico whether a TF can bind a given sequence is a central task in computational biology. Although many methods have been proposed for this problem, assessing the statistical significance of a prediction remains unsolved. We propose an approach that gives a good approximation of the potential of a sequence to be bound by a TF. Instead of assessing distinct binding sites, we argue for focusing on the number of binding sites. Based on a suitable statistical model, we approximate the probabilities needed to score the binding of a TF to a sequence. Two examples show the necessity of such a model as well as the superiority of the proposed method over standard approaches.
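
The idea of scoring a sequence by its number of binding sites, rather than by individual sites, can be illustrated with a small sketch. The abstract does not specify the statistical model, so everything below is an assumption made for illustration: a toy position weight matrix (`PWM`), an i.i.d. background, and a Poisson approximation for the number of hits. The Poisson step ignores dependence between overlapping windows and is not claimed to be the model used in the paper.

```python
import math
from itertools import product

# Hypothetical toy position weight matrix (log-odds scores), width 4.
# Values are illustrative only and do not come from the paper.
PWM = [
    {"A": 1.2, "C": -1.0, "G": -1.0, "T": -0.5},
    {"A": -1.0, "C": 1.1, "G": -0.8, "T": -1.0},
    {"A": -0.7, "C": -1.0, "G": 1.3, "T": -1.0},
    {"A": -1.0, "C": -0.9, "G": -1.0, "T": 1.0},
]


def window_score(window):
    """Sum the PWM scores of the bases in one window of motif width."""
    return sum(col[base] for col, base in zip(PWM, window))


def count_hits(seq, threshold):
    """Count windows in seq whose PWM score reaches the threshold."""
    w = len(PWM)
    return sum(
        1
        for i in range(len(seq) - w + 1)
        if window_score(seq[i:i + w]) >= threshold
    )


def single_hit_probability(threshold, background):
    """Probability that one random window is a hit, by enumerating
    all 4^w words under an i.i.d. background (assumed model)."""
    p = 0.0
    for word in product("ACGT", repeat=len(PWM)):
        if window_score(word) >= threshold:
            p += math.prod(background[b] for b in word)
    return p


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    return 1.0 - sum(
        math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k)
    )


# Example: how surprising is the observed number of hits?
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
seq = "ACGTTTACGTGGACGTAACC"
threshold = 4.0

observed = count_hits(seq, threshold)
n_windows = len(seq) - len(PWM) + 1
lam = n_windows * single_hit_probability(threshold, background)
p_value = poisson_tail(observed, lam)
print(observed, lam, p_value)
```

A small `p_value` here would suggest that the sequence carries more hits than expected by chance under the assumed background. A real analysis would need a model that accounts for overlapping hits and a realistic background composition.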

Publication types

  • Comparative Study
  • Validation Study

MeSH terms

  • Animals
  • Binding Sites / genetics
  • DNA / genetics*
  • DNA / metabolism
  • Homeodomain Proteins / genetics
  • Homeodomain Proteins / metabolism
  • Humans
  • MADS Domain Proteins / genetics
  • MADS Domain Proteins / metabolism
  • MEF2 Transcription Factors
  • Mice
  • Models, Genetic*
  • Models, Statistical*
  • Myogenic Regulatory Factors / genetics
  • Myogenic Regulatory Factors / metabolism
  • Sequence Analysis, DNA*
  • Transcription Factors / genetics
  • Transcription Factors / metabolism*

Substances

  • Homeodomain Proteins
  • IRX4 protein, human
  • Irx4 protein, mouse
  • MADS Domain Proteins
  • MEF2 Transcription Factors
  • MEF2C protein, human
  • Mef2c protein, mouse
  • Myogenic Regulatory Factors
  • Transcription Factors
  • DNA