Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules

Nat Protoc. 2008;3(10):1578-88. doi: 10.1038/nprot.2008.97.

Abstract

This protocol shows how to detect putative cis-regulatory elements, and regions enriched in such elements, with the Regulatory Sequence Analysis Tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach uses the program matrix-scan and applies to known transcription factors whose binding specificity is represented by position-specific scoring matrices. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P values and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a case study, the upstream sequence of the Drosophila melanogaster gene even-skipped. The protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
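
The general idea of scanning a sequence with a position-specific scoring matrix and filtering hits by P value can be sketched as follows. This is a minimal illustration, not the matrix-scan implementation: the count matrix, the uniform background, the pseudocount, and all function names are hypothetical, only one strand is scanned, and the P value is obtained by brute-force enumeration of all words, which is only practical for short matrices.

```python
import math
from itertools import product

# Hypothetical 4-column count matrix (one dict per position); not from the protocol.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
# Assumed uniform background model and pseudocount.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
PSEUDO = 1.0


def weights(counts, bg, pseudo):
    """Convert counts to log-odds weights ln(f / bg), with a pseudocount."""
    result = []
    for col in counts:
        total = sum(col.values()) + pseudo
        result.append({
            b: math.log(((col[b] + pseudo * bg[b]) / total) / bg[b])
            for b in "ACGT"
        })
    return result


def score(word, w):
    """Sum the position-specific weights of a word of matrix width."""
    return sum(w[i][b] for i, b in enumerate(word))


def p_value(s, w, bg):
    """Probability that a background word scores at least s (exact enumeration)."""
    total = 0.0
    for word in product("ACGT", repeat=len(w)):
        if score(word, w) >= s - 1e-12:
            total += math.prod(bg[b] for b in word)
    return total


def scan(seq, w, bg, max_p):
    """Return (position, word, score, P value) for hits with P value <= max_p."""
    hits = []
    width = len(w)
    for i in range(len(seq) - width + 1):
        word = seq[i:i + width]
        if any(b not in "ACGT" for b in word):
            continue  # skip windows containing ambiguous characters
        s = score(word, w)
        p = p_value(s, w, bg)
        if p <= max_p:
            hits.append((i, word, s, p))
    return hits


W = weights(COUNTS, BACKGROUND, PSEUDO)
for pos, word, s, p in scan("TTACGTAA", W, BACKGROUND, max_p=0.005):
    print(pos, word, round(s, 3), p)
```

Thresholding on the P value rather than on the raw score makes the cutoff comparable between matrices of different widths and information content, which is one reason a P value estimate helps reduce false predictions.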

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites / genetics*
  • Computational Biology / methods*
  • Conserved Sequence / genetics
  • Genetic Techniques*
  • Genomics / methods*
  • Oligonucleotides / genetics
  • Regulatory Elements, Transcriptional / genetics*
  • Software*
  • Transcription Factors / genetics*

Substances

  • Oligonucleotides
  • Transcription Factors