Integrating Bacterial ChIP-seq and RNA-seq Data With SnakeChunks

Curr Protoc Bioinformatics. 2019 Jun;66(1):e72. doi: 10.1002/cpbi.72. Epub 2019 Feb 20.


Next-generation sequencing (NGS) is becoming a routine approach in most domains of the life sciences. To ensure reproducibility of results, there is a crucial need to improve the automation of NGS data processing and enable forthcoming studies relying on big datasets. Although user-friendly interfaces now exist, there remains a strong need for accessible solutions that allow experimental biologists to analyze and explore their results in an autonomous and flexible way. The protocols here describe a modular system that enables a user to compose and fine-tune workflows based on SnakeChunks, a library of rules for the Snakemake workflow engine. They are illustrated using a study combining ChIP-seq and RNA-seq to identify target genes of the global transcription factor FNR in Escherichia coli, which has the advantage that results can be compared with the most up-to-date collection of existing knowledge about transcriptional regulation in this model organism, extracted from the RegulonDB database. © 2019 by John Wiley & Sons, Inc.

Keywords: ChIP-seq; Escherichia coli K-12; FAIR Guiding Principles; RNA-seq; reproducible science; workflow.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / genetics*
  • Base Sequence
  • Chromatin Immunoprecipitation Sequencing / methods*
  • Genome, Bacterial
  • Nucleotide Motifs / genetics
  • RNA-Seq*
  • Software*
  • User-Computer Interface