BMC Genomics. 2014 Jun 30;15(1):539.
doi: 10.1186/1471-2164-15-539.

ChIPseek, a web-based analysis tool for ChIP data

Ting-Wen Chen et al. BMC Genomics.

Abstract

Background: Chromatin is a dynamic but highly regulated structure. DNA-binding proteins such as transcription factors and epigenetic and chromatin modifiers are responsible for regulating specific gene expression patterns and may give rise to different phenotypes. Chromatin immunoprecipitation (ChIP) is the most widely used technique for identifying the proteins associated with a specific region of DNA. A ChIP assay followed by next-generation sequencing (ChIP-seq) or microarray analysis (ChIP-chip) is often used to study protein-binding profiles in different cell types and in cancer samples on a genome-wide scale. However, only a limited number of bioinformatics tools are available for analyzing ChIP datasets.

Results: We present ChIPseek, a web-based tool for ChIP data analysis that provides summary statistics in graphs and offers several commonly requested analyses. ChIPseek provides a statistical summary of the dataset, including a histogram of the peak length distribution, a histogram of distances to the nearest transcription start site (TSS), and a pie chart (or bar chart) of genomic locations, giving users a comprehensive view of the dataset before further analysis. For examining the potential functions of peaks, ChIPseek provides peak annotation, visualization of peak genomic locations, motif identification, sequence extraction, and comparison between datasets. ChIPseek also lets users filter peaks and re-analyze the filtered subset. ChIPseek supports 20 different genome assemblies for 12 model organisms: human, mouse, rat, worm, fly, frog, zebrafish, chicken, yeast, fission yeast, Arabidopsis, and rice. We use demo datasets to demonstrate the usage and intuitive user interface of ChIPseek.

Conclusions: ChIPseek provides a user-friendly interface for biologists to analyze large-scale ChIP data without requiring any programming skills. All the results and figures produced by ChIPseek can be downloaded for further analysis. The analysis tools built into ChIPseek, especially those for selecting and examining a subset of peaks from ChIP data, provide invaluable help for exploring high-throughput data from either ChIP-seq or ChIP-chip. ChIPseek is freely available at http://chipseek.cgu.edu.tw.

Figures

Figure 1
Annotation results for binding sites of ATF2, ATF3, ETS1 and GATA1. ChIPseek separates annotation tables into separate tabs, one for each uploaded file. In each tab, ChIPseek shows the annotation results for each chromosome in an annotation table. Here, a partial annotation result for binding sites of ATF3 on chromosome 13 is shown. For this TF, the total number of peaks (23,095 peaks) is shown above the annotation table, followed by a link for downloading the full annotation table (highlighted by the red box). Within the table, the columns show the location of each peak, the genomic location annotation, the distance to the nearest TSS, the nearest RefSeq, the gene name, etc. The user can click on the title of a column to sort by that column. In the table, words in light blue are hyperlinks leading to external databases or a genome browser for each peak, i.e., the NCBI RefSeq database, the UniProt database and the UCSC genome browser. The user may also specify regions of interest and visit those particular regions using the text boxes above the annotation table (highlighted by the blue box).
Figure 2
Pie chart and bar chart of genomic location distribution. (A) Pie chart of the genomic location distribution of transcription factor ATF2. This plot shows the percentage for each genomic location category. The categories are sorted by descending percentage. The exact percentage for each category appears if the mouse pointer hovers over a pie slice. (B) Bar chart of the genomic location distribution of four transcription factors, ETS1, ATF2, ATF3 and GATA1. All uploaded files are combined into the same bar chart. This bar chart reveals the actual number of peaks for each category when the mouse pointer hovers over each bar.
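The tallying behind such a chart can be illustrated with a minimal Python sketch. This is not ChIPseek's code: the function name `location_distribution` and the category labels in the usage example are hypothetical, and the only behavior taken from the caption is that categories are reported with their counts and percentages in descending order.

```python
from collections import Counter


def location_distribution(categories):
    """Tally genomic location categories for a list of peaks.

    `categories` holds one label per peak (labels are hypothetical here).
    Returns (category, count, percent) tuples sorted by descending count,
    which is also descending percentage.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return []
    # most_common() orders by count, largest first.
    return [(cat, n, 100.0 * n / total) for cat, n in counts.most_common()]


if __name__ == "__main__":
    labels = ["promoter", "intron", "promoter", "intergenic"]
    for cat, n, pct in location_distribution(labels):
        print(f"{cat}\t{n}\t{pct:.1f}%")
```

The counts correspond to what the bar chart reports on hover, and the percentages to what the pie chart reports.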
Figure 3
Histogram of distance to the nearest TSS and of peak lengths. (A) The distribution of distance to the nearest TSS. This example is of the transcription factor binding sites for ATF2. The x-axis of the histogram is centered at 0 and divided into 100 bins that cover the largest and smallest values of distance. As shown in this histogram, most of the binding sites are located near the TSS. The exact number of peaks for each range of distance appears when the mouse pointer hovers over that bar. (B) The distribution of the peak lengths of transcription factor binding sites of ATF2. Most of the peaks have a length smaller than 600 bp. Again, if users are interested in the exact number of peaks within each range, hovering over that range will reveal the value. (C) The user may use filter criteria to select a subset of peaks. There are two ways to filter the peaks (highlighted by the red boxes). The first is the slider bar and the second is the text box. In this example, we use text boxes to filter out peaks with distance to the nearest TSS larger than +200 or smaller than -200. After this operation, 2,847 peaks are left. After the selection step, the histogram is refreshed with this subset of peaks in real time. After this filter step, we save the filtered subset with the “save” button above the histogram.
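The binning and filtering described in this caption can be approximated as follows. This is a rough sketch rather than ChIPseek's implementation: it assumes that "centered at 0" means a symmetric range from -m to +m, where m is the largest absolute distance, and that the filter bounds are inclusive. The function names and the `distance_to_tss` key are hypothetical.

```python
def symmetric_histogram(distances, n_bins=100):
    """Bin signed TSS distances into n_bins equal-width bins centered at 0.

    Assumption: the range is [-m, +m] with m = max(|distance|), so that it
    covers both the smallest and the largest value. Returns (edges, counts).
    """
    if not distances:
        return [], []
    limit = max(abs(d) for d in distances) or 1  # avoid a zero-width range
    width = 2 * limit / n_bins
    counts = [0] * n_bins
    for d in distances:
        # The maximum value falls on the right edge; put it in the last bin.
        idx = min(int((d + limit) / width), n_bins - 1)
        counts[idx] += 1
    edges = [-limit + i * width for i in range(n_bins + 1)]
    return edges, counts


def filter_by_tss_distance(peaks, low=-200, high=200):
    """Keep peaks whose distance to the nearest TSS lies in [low, high]."""
    return [p for p in peaks if low <= p["distance_to_tss"] <= high]


if __name__ == "__main__":
    peaks = [{"name": f"peak{i}", "distance_to_tss": d}
             for i, d in enumerate([-1000, -150, 0, 150, 1000])]
    kept = filter_by_tss_distance(peaks)
    edges, counts = symmetric_histogram([p["distance_to_tss"] for p in kept])
    print(len(kept), sum(counts))
```

Recomputing the histogram from the filtered list mirrors the refresh step described in panel (C).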
Figure 4
Comparison between binding sites of ATF2 and ATF3 and Venn diagram. After ATF2_-200-200_TSS and ATF3_-200-200_TSS are selected for comparison, ChIPseek compares the peaks from these two datasets. The overall comparison result is shown as a Venn diagram in the first tab. As shown here, 1,359 peaks are unique to ATF2, 2,499 peaks are unique to ATF3, and 1,499 peak pairs overlap between the binding sites of ATF2 and ATF3. The detailed peak information for unique peaks and overlapped peak pairs can be found in the following three tabs.
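A comparison of this kind can be sketched in a few lines of Python. The source does not state ChIPseek's overlap criterion, so the sketch assumes that two peaks form a pair when they lie on the same chromosome and share at least one base, with half-open `(chrom, start, end)` coordinates. The function name `compare_peaks` is hypothetical.

```python
from collections import defaultdict


def compare_peaks(peaks_a, peaks_b):
    """Split two peak lists into unique peaks and overlapping peak pairs.

    Peaks are (chrom, start, end) tuples with half-open coordinates.
    Assumption: a pair overlaps when both peaks are on the same chromosome
    and share at least one base. Returns (unique_a, unique_b, pairs).
    """
    by_chrom_b = defaultdict(list)
    for peak in peaks_b:
        by_chrom_b[peak[0]].append(peak)

    pairs = []
    for a in peaks_a:
        for b in by_chrom_b.get(a[0], []):
            if a[1] < b[2] and b[1] < a[2]:
                pairs.append((a, b))

    matched_a = {a for a, _ in pairs}
    matched_b = {b for _, b in pairs}
    unique_a = [p for p in peaks_a if p not in matched_a]
    unique_b = [p for p in peaks_b if p not in matched_b]
    return unique_a, unique_b, pairs


if __name__ == "__main__":
    atf2 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    atf3 = [("chr1", 150, 250), ("chr2", 300, 400)]
    unique_a, unique_b, pairs = compare_peaks(atf2, atf3)
    print(len(unique_a), len(unique_b), len(pairs))
```

Because one peak can overlap several peaks in the other dataset, the result is counted in peak pairs rather than peaks. The nested loop is quadratic per chromosome; a sorted sweep or an interval tree would scale better to large datasets.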
Figure 5
Overlapped peak pairs. The number of peak pairs found is shown at the top of the page, along with a link for downloading all peak pairs with their annotation. The overlapped peak pairs are separated into different tables according to their location. The table shown here lists peak pairs located on chromosome X. The first four columns list the start and end positions of the peaks. The fifth column shows the relative positions of the peak pairs. The last column provides links to the UCSC genome browser for that region of interest. Clicking on the title of a column sorts by that column.
Figure 6
Chromosome ideograms of CTCF. Peaks are plotted on the chromosome ideograms at their positions. The exact position and nearest gene appear when the mouse pointer hovers over a peak. Clicking on a peak links to the UCSC genome browser. There are 10 different colors available for selection: blue, pink, green, yellow, gold, purple, aqua, fuchsia, silver, and red. The user may plot different datasets on the same ideogram with different colors. To plot additional peaks, the user can clear all marks with the “clear all peaks” button.
Figure 7
Motif identification result for binding sites of CTCF. Shown here is the result of motif enrichment analysis for ±25 bp around the center of CTCF binding sites. The identified motifs are sorted according to their p-values. Clicking on “more information” displays the details for that enriched motif. The original HOMER prediction result is available from the hyperlink provided above the table.
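The ±25 bp window around a peak center can be illustrated with the sketch below. It does not reproduce how HOMER or ChIPseek extracts sequences: it assumes 0-based half-open coordinates, takes the center as the integer midpoint of the peak, and clamps the window at the start of the sequence. Both function names are hypothetical.

```python
def window_around_center(start, end, flank=25):
    """Return the (start, end) window spanning `flank` bp on each side of
    the peak center.

    Assumptions: 0-based half-open coordinates, center = integer midpoint,
    and the window never starts before position 0.
    """
    center = (start + end) // 2
    return max(0, center - flank), center + flank


def extract_window(sequence, start, end, flank=25):
    """Slice the window around the peak center out of a chromosome sequence."""
    win_start, win_end = window_around_center(start, end, flank)
    return sequence[win_start:win_end]


if __name__ == "__main__":
    chrom_seq = "ACGT" * 50  # toy 200 bp sequence standing in for a chromosome
    print(window_around_center(100, 200))
    print(len(extract_window(chrom_seq, 100, 200)))
```

Restricting the search to a narrow window around the center concentrates motif enrichment analysis on the region where the binding site is most likely to lie.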
