A varying threshold method for ChIP peak-calling using multiple sources of information

Bioinformatics. 2010 Sep 15;26(18):i504-10. doi: 10.1093/bioinformatics/btq379.

Abstract

Motivation: Gene regulation commonly involves interactions among DNA, proteins and biochemical conditions. Using chromatin immunoprecipitation (ChIP) technologies, protein-DNA interactions are routinely detected at the genome scale. Computational methods that detect weak protein-binding signals while maintaining high specificity remain challenging to develop. An attractive approach is to incorporate biologically relevant data, such as protein co-occupancy, to improve the power of protein-binding detection. We refer to the additional data related to target protein binding as supporting tracks.

Results: We propose a novel and rigorous statistical method to identify protein occupancy in ChIP data using multiple supporting tracks (PASS2). We demonstrate that utilizing biologically related information can significantly increase the discovery of true protein-binding sites while still maintaining a desired level of false positive calls. Applying the method to GATA1 restoration in a mouse erythroid cell line, we detected many new GATA1-binding sites using GATA1 co-occupancy data.
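
The general idea of a threshold that varies with supporting evidence can be illustrated with a minimal sketch. This is not the PASS2 algorithm, which the abstract does not specify; it swaps in a standard weighted Bonferroni procedure as a stand-in. The function names, the binary `support` flags and the `boost` parameter are all hypothetical, and the weights are assumed to be fixed before looking at the p-values.

```python
def weighted_thresholds(support, alpha=0.05, boost=4.0):
    """Per-window p-value cutoffs for a weighted Bonferroni procedure.

    support: one boolean per genomic window, True where a (hypothetical)
             supporting track such as co-occupancy shows signal.
    boost:   assumed relative weight given to supported windows.
    The weights are rescaled to average 1, so the cutoffs sum to alpha
    and the family-wise error rate stays at or below alpha.
    """
    m = len(support)
    raw = [boost if s else 1.0 for s in support]
    mean_raw = sum(raw) / m
    weights = [r / mean_raw for r in raw]
    return [alpha * w / m for w in weights]


def call_peaks(pvalues, support, alpha=0.05, boost=4.0):
    """Return indices of windows whose p-value passes its own cutoff."""
    cutoffs = weighted_thresholds(support, alpha, boost)
    return [i for i, (p, c) in enumerate(zip(pvalues, cutoffs)) if p <= c]


# Toy data (invented for illustration): window 1 has a weak signal
# but is backed by the supporting track; window 2 has the same
# p-value without support.
pvalues = [0.001, 0.015, 0.015, 0.5]
support = [False, True, False, False]
print(call_peaks(pvalues, support))
```

In this toy input, a uniform Bonferroni cutoff of 0.05 / 4 = 0.0125 would miss both weak windows. The weighted version relaxes the cutoff only where support exists and tightens it elsewhere, so the total error budget is unchanged.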

Availability: http://stat.psu.edu/~yuzhang/pass2.tar

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Animals
  • Binding Sites
  • Cell Line
  • Chromatin Immunoprecipitation / methods*
  • Computer Simulation
  • DNA / metabolism
  • GATA1 Transcription Factor / metabolism*
  • Mathematical Computing
  • Mice
  • Protein Binding

Substances

  • GATA1 Transcription Factor
  • Gata1 protein, mouse
  • DNA