A complete workflow for the analysis of full-size ChIP-seq (and similar) data sets using peak-motifs

Nat Protoc. 2012 Jul 26;7(8):1551-68. doi: 10.1038/nprot.2012.088.


This protocol explains how to use the online integrated pipeline 'peak-motifs' (http://rsat.ulb.ac.be/rsat/) to predict motifs and binding sites in full-size peak sets obtained by chromatin immunoprecipitation-sequencing (ChIP-seq) or related technologies. The workflow combines four time- and memory-efficient motif discovery algorithms to extract significant motifs from the sequences. Discovered motifs are compared with databases of known motifs to identify potentially bound transcription factors. Sequences are scanned to predict transcription factor binding sites and analyze their enrichment and positional distribution relative to peak centers. Peaks and binding sites are exported as BED tracks that can be uploaded into the University of California Santa Cruz (UCSC) genome browser for visualization in the genomic context. This protocol is illustrated with the analysis of a set of 6,000 peaks (8 Mb in total) bound by the Drosophila transcription factor Krüppel. The complete workflow is achieved in about 25 min of computational time on the Regulatory Sequence Analysis Tools (RSAT) Web server. This protocol can be followed in about 1 h.
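The last two analysis steps described above (positional distribution of predicted sites relative to peak centers, and export as BED tracks) can be illustrated with a minimal sketch. This is not the peak-motifs implementation: the `Peak` and `Site` types, the function names and the 50 bp bin width are hypothetical and chosen only for illustration. The sketch assumes sites are reported as 0-based offsets within each peak sequence, and it uses the standard BED convention of 0-based, half-open intervals.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Peak(NamedTuple):
    """Hypothetical peak record in BED-style coordinates."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


class Site(NamedTuple):
    """Hypothetical predicted binding site, located relative to its peak."""
    peak: Peak
    offset: int  # 0-based start of the site within the peak sequence
    length: int
    strand: str  # "+" or "-"
    name: str    # e.g. the identifier of the matching motif


def site_to_bed(site: Site) -> str:
    """Convert a peak-relative site to one 6-column BED line (genomic coordinates)."""
    start = site.peak.start + site.offset
    end = start + site.length
    return "\t".join([site.peak.chrom, str(start), str(end), site.name, "0", site.strand])


def center_offset(site: Site) -> int:
    """Signed distance (bp) from the peak center to the site center."""
    peak_center = (site.peak.start + site.peak.end) // 2
    site_center = site.peak.start + site.offset + site.length // 2
    return site_center - peak_center


def positional_histogram(sites: Iterable[Site], bin_width: int = 50) -> Counter:
    """Count sites per bin; each bin is keyed by its lower bound relative to the peak center."""
    return Counter((center_offset(s) // bin_width) * bin_width for s in sites)


def bed_track(sites: Iterable[Site], track_name: str = "predicted_sites") -> str:
    """Assemble a BED track (track line plus one line per site) for genome browser upload."""
    header = f'track name="{track_name}"'
    return "\n".join([header] + [site_to_bed(s) for s in sites])
```

For example, with a peak `Peak("chr2R", 1000, 1200)`, a 10 bp site at offset 95 is centered on the peak (offset 0), while a site at offset 20 lies 75 bp upstream of the center. A histogram concentrated around 0 would be consistent with sites clustering near peak centers.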

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Binding Sites
  • Chromatin Immunoprecipitation / methods*
  • Drosophila Proteins / genetics
  • Drosophila melanogaster / embryology
  • Drosophila melanogaster / genetics
  • Embryo, Nonmammalian
  • Genomics / methods*
  • Kruppel-Like Transcription Factors / genetics
  • Mice
  • Nucleotide Motifs*
  • Sequence Analysis, DNA / methods*
  • Software*
  • Time Factors
  • Transcription Factors / genetics
  • Transcription Factors / metabolism
  • Workflow*


Substances

  • Drosophila Proteins
  • Kr protein, Drosophila
  • Kruppel-Like Transcription Factors
  • Transcription Factors

Associated data

  • GEO/GSM439463
  • GEO/GSM511084
  • GEO/GSM559652