A structural-based strategy for recognition of transcription factor binding sites

PLoS One. 2013;8(1):e52460. doi: 10.1371/journal.pone.0052460. Epub 2013 Jan 8.

Abstract

Scanning through genomes for potential transcription factor binding sites (TFBSs) is becoming increasingly important in this post-genomic era. The position weight matrix (PWM) is the standard representation of TFBSs utilized when scanning through sequences for potential binding sites. However, many transcription factor (TF) motifs are short and highly degenerate, and methods utilizing PWMs to scan for sites are plagued by false positives. Furthermore, many important TFs do not have well-characterized PWMs, making identification of potential binding sites even more difficult. One approach to the identification of sites for these TFs has been to use the 3D structure of the TF to predict the DNA structure around the TF and then to generate a PWM from the predicted 3D complex structure. However, this approach is dependent on the similarity of the predicted structure to the native structure. We introduce here a novel approach to identify TFBSs utilizing structural information that can be applied to TFs without characterized PWMs, as long as a 3D complex structure (TF/DNA) exists. This approach utilizes an energy function that is uniquely trained on each structure. Our approach leads to increased prediction accuracy and robustness compared with approaches that use a more general energy function. The software is freely available upon request.
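
As a point of reference for the PWM scanning that the abstract describes as the standard approach, the following is a minimal sketch of a log-odds PWM scan. It is not the structure-based method introduced in the paper. The function names, the pseudocount, the uniform background, the toy count matrix, and the threshold are all assumptions made for illustration.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    Assumes a uniform background frequency; both defaults are illustrative.
    """
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, site, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Skip windows containing ambiguous symbols such as N.
        if any(b not in BASES for b in window):
            continue
        score = sum(pwm[i][b] for i, b in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical counts for a short, slightly degenerate 4-bp motif (consensus TGAC).
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 8, "C": 0, "G": 0, "T": 2},
    {"A": 0, "C": 9, "G": 1, "T": 0},
]
toy_pwm = log_odds_pwm(toy_counts)
hits = scan("AATGACGGTGTCAA", toy_pwm, threshold=4.0)
for start, site, score in hits:
    print(start, site, round(score, 2))
```

With this toy matrix and threshold, the scan reports both the consensus site TGAC and the weaker variant TGTC, which illustrates how a short, degenerate motif admits extra matches as the threshold is relaxed.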

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Binding Sites
  • DNA, Fungal / chemistry
  • DNA, Fungal / genetics
  • DNA, Fungal / metabolism
  • Databases, Protein
  • Knowledge Bases
  • Models, Molecular
  • Nucleic Acid Conformation
  • Position-Specific Scoring Matrices
  • Protein Conformation
  • Saccharomyces cerevisiae Proteins / chemistry*
  • Saccharomyces cerevisiae Proteins / genetics
  • Saccharomyces cerevisiae Proteins / metabolism
  • Transcription Factors / chemistry*
  • Transcription Factors / genetics
  • Transcription Factors / metabolism

Substances

  • DNA, Fungal
  • Saccharomyces cerevisiae Proteins
  • Transcription Factors

Grant support

GL acknowledges funding from the National Sciences Foundation of China (no. 31070641), the National 973 Program of China (no. 2012CB721000), and start-up funding from SKLMRD and DICP, CAS (Chinese Academy of Sciences). The funders covered most of the costs of study design, data collection and analysis, decision to publish, or preparation of the manuscript.