Estimating binding properties of transcription factors from genome-wide binding profiles

Nucleic Acids Res. 2015 Jan;43(1):84-94. doi: 10.1093/nar/gku1269. Epub 2014 Nov 28.

Abstract

The binding of transcription factors (TFs) is essential for gene expression. One important characteristic of a putative binding site in the genome is its actual occupancy. In this study, we propose an analytical model to predict genomic occupancy that incorporates the preferred target sequence of a TF in the form of a position weight matrix (PWM), DNA accessibility data (in the case of eukaryotes), the number of TF molecules expected to be bound specifically to the DNA and a parameter that modulates the specificity of the TF. Given actual occupancy data in the form of ChIP-seq profiles, we inferred the copy number and specificity of five Drosophila TFs during early embryonic development: Bicoid, Caudal, Giant, Hunchback and Kruppel. Our results suggest that each of these TFs has thousands of molecules specifically bound to the DNA and that, whilst Bicoid and Caudal display higher specificity, the other three TFs (Giant, Hunchback and Kruppel) display lower specificity in their binding, despite having PWMs with higher information content. This study gives further weight to earlier investigations into TF copy numbers that suggest a significant proportion of molecules are not bound specifically to the DNA.
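
The abstract names four inputs to the occupancy model (a PWM, DNA accessibility, the number of specifically bound molecules and a specificity parameter) but does not state the formula. The sketch below shows one way such inputs could be combined, assuming a simple statistical-thermodynamics form in which each site's weight is its accessibility times exp(PWM score / lambda). That functional form, and the names `pwm_score`, `occupancy`, `n_bound` and `lam`, are illustrative assumptions rather than the authors' published equations.

```python
import math

BASES = "ACGT"


def pwm_score(pwm, site):
    """Sum the per-position PWM weights for one candidate site."""
    return sum(pwm[i][BASES.index(base)] for i, base in enumerate(site))


def occupancy(seq, pwm, accessibility, n_bound, lam):
    """Predicted occupancy in [0, 1) for every site start in seq.

    ASSUMED form (not taken from the abstract):
        weight_j = accessibility_j * exp(score_j / lam)
        occ_j    = n_bound * weight_j / (n_bound * weight_j + sum_i weight_i)
    A larger lam flattens the score differences, i.e. lower specificity.
    """
    width = len(pwm)
    n_sites = len(seq) - width + 1
    weights = [
        accessibility[j] * math.exp(pwm_score(pwm, seq[j:j + width]) / lam)
        for j in range(n_sites)
    ]
    total = sum(weights)  # genome-wide competition term
    return [n_bound * w / (n_bound * w + total) for w in weights]


# Toy example: a hypothetical 2-bp PWM that prefers "AC" (columns follow BASES).
toy_pwm = [
    [1.0, -1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0, -1.0],
]
toy_seq = "ACGTAC"
toy_access = [1.0, 1.0, 1.0, 1.0, 0.0]  # the last site is in closed chromatin

specific = occupancy(toy_seq, toy_pwm, toy_access, n_bound=10, lam=1.0)
unspecific = occupancy(toy_seq, toy_pwm, toy_access, n_bound=10, lam=10.0)
```

Under these assumptions, inferring copy number and specificity from a ChIP-seq profile would amount to searching over `n_bound` and `lam` for the pair whose predicted profile best matches the measured one.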

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Binding Sites
  • Cell Nucleus / metabolism
  • DNA / metabolism
  • Drosophila melanogaster / embryology
  • Drosophila melanogaster / genetics
  • Genomics
  • Position-Specific Scoring Matrices
  • Protein Binding
  • Regulatory Elements, Transcriptional*
  • Transcription Factors / metabolism*

Substances

  • Transcription Factors
  • DNA