MULTIPLE TESTING OF LOCAL MAXIMA FOR DETECTION OF PEAKS IN CHIP-SEQ DATA

Ann Appl Stat. 2013;7(1):471-494. doi: 10.1214/12-aoas594.

Abstract

A topological multiple testing approach to peak detection is proposed for the problem of detecting transcription factor binding sites in ChIP-Seq data. After kernel smoothing of the tag counts over the genome, the presence of a peak is tested at each observed local maximum, followed by multiple testing correction at the desired false discovery rate level. Valid p-values for candidate peaks are computed via Monte Carlo simulations of smoothed Poisson sequences, whose background Poisson rates are obtained via linear regression from a Control sample at two different scales. The proposed method identifies nearby binding sites that other methods do not.

Keywords: Poisson sequence; false discovery rate; kernel smoothing; matched filter; topological inference.
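
The procedure summarized in the abstract (smooth the tag counts, test each local maximum, then control the false discovery rate) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian kernel, a constant background Poisson rate supplied by the caller (the paper instead estimates the rates by regression from a Control sample), and the Benjamini-Hochberg procedure for the multiple testing correction. All function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_kernel(bandwidth):
    # Assumption: a truncated Gaussian kernel normalized to sum to 1.
    half = int(4 * bandwidth)
    x = np.arange(-half, half + 1)
    k = np.exp(-0.5 * (x / bandwidth) ** 2)
    return k / k.sum()

def smooth(counts, kernel):
    # Kernel smoothing of the tag counts along the genome coordinate.
    return np.convolve(counts, kernel, mode="same")

def local_maxima(y):
    # Interior indices that rise from the left and do not rise to the right.
    return np.flatnonzero((y[1:-1] > y[:-2]) & (y[1:-1] >= y[2:])) + 1

def null_heights(rate, length, kernel, n_sim, rng):
    # Monte Carlo: heights of local maxima of smoothed background-only sequences.
    heights = []
    for _ in range(n_sim):
        y = smooth(rng.poisson(rate, size=length), kernel)
        heights.append(y[local_maxima(y)])
    return np.sort(np.concatenate(heights))

def mc_pvalues(observed, null_sorted):
    # Fraction of null heights at least as large as each observed height,
    # with a +1 correction so that no p-value is exactly zero.
    n_ge = len(null_sorted) - np.searchsorted(null_sorted, observed, side="left")
    return (n_ge + 1) / (len(null_sorted) + 1)

def benjamini_hochberg(pvals, alpha):
    # Step-up procedure: reject the k smallest p-values, where k is the
    # largest rank whose p-value is at most alpha * rank / m.
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    reject = np.zeros(m, dtype=bool)
    if m == 0:
        return reject
    order = np.argsort(pvals)
    passed = pvals[order] <= alpha * np.arange(1, m + 1) / m
    if passed.any():
        k = np.flatnonzero(passed).max()
        reject[order[:k + 1]] = True
    return reject

def detect_peaks(counts, rate, bandwidth=5.0, alpha=0.05, n_sim=200, seed=0):
    # Assumption: `rate` is a known constant background rate per position.
    rng = np.random.default_rng(seed)
    kernel = gaussian_kernel(bandwidth)
    y = smooth(np.asarray(counts, dtype=float), kernel)
    idx = local_maxima(y)
    null = null_heights(rate, len(counts), kernel, n_sim, rng)
    pvals = mc_pvalues(y[idx], null)
    return idx[benjamini_hochberg(pvals, alpha)]
```

As a usage example, `detect_peaks(counts, rate=1.0)` returns the positions of the local maxima declared significant at FDR level 0.05. Pooling null heights over all simulated maxima is one simple choice for the Monte Carlo reference distribution; a position-dependent background rate would require simulating with that rate profile instead.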