Nucleic Acids Res. 2017 May 5;45(8):e58.
doi: 10.1093/nar/gkw1319.

HMCan-diff: A Method to Detect Changes in Histone Modifications in Cells With Different Genetic Characteristics

Free PMC article

Haitham Ashoor et al. Nucleic Acids Res.

Abstract

Comparing histone modification profiles between cancer and normal states, or across different tumor samples, can provide insights into understanding cancer initiation, progression and response to therapy. ChIP-seq histone modification data of cancer samples are distorted by copy number variation innate to any cancer cell. We present HMCan-diff, the first method designed to analyze ChIP-seq data to detect changes in histone modifications between two cancer samples of different genetic backgrounds, or between a cancer sample and a normal control. HMCan-diff explicitly corrects for copy number bias, and for other biases in the ChIP-seq data, which significantly improves prediction accuracy compared to methods that do not consider such corrections. On in silico simulated ChIP-seq data generated using genomes with differences in copy number profiles, HMCan-diff shows a much better performance compared to other methods that have no correction for copy number bias. Additionally, we benchmarked HMCan-diff on four experimental datasets, characterizing two histone marks in two different scenarios. We correlated changes in histone modifications between a cancer and a normal control sample with changes in gene expression. On all experimental datasets, HMCan-diff demonstrated better performance compared to the other methods.
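
To make the idea of copy number correction concrete, the sketch below rescales a binned fragment density by the local copy number and by library size. It is only an illustration under stated assumptions, not HMCan-diff's actual procedure: the function name `normalize_density`, the default `ploidy` of 2, and the per-million scaling are all hypothetical choices that do not come from the abstract.

```python
def normalize_density(density, copy_number, library_size, ploidy=2, scale=1_000_000):
    """Rescale per-bin fragment density for copy number and library size.

    Hypothetical helper for illustration only; the real HMCan-diff
    normalization also handles GC-content bias and noise level.
    """
    if len(density) != len(copy_number):
        raise ValueError("density and copy_number must have the same length")
    if library_size <= 0:
        raise ValueError("library_size must be positive")

    normalized = []
    for d, cn in zip(density, copy_number):
        if cn <= 0:
            # Assumption: bins with no copies carry no usable signal.
            normalized.append(0.0)
            continue
        # A bin present in 4 copies is expected to yield twice the reads of
        # a diploid bin, so its density is scaled by ploidy / cn.
        normalized.append(d * (ploidy / cn) * (scale / library_size))
    return normalized


if __name__ == "__main__":
    raw = [10, 20, 30]      # fragment counts per bin
    copies = [2, 4, 1]      # estimated copy number per bin
    print(normalize_density(raw, copies, library_size=1_000_000))
```

Under this simple model, a gained region and a diploid region with the same underlying enrichment end up with comparable normalized values, which is the property a differential caller needs before comparing two samples with different copy number profiles.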

Figures

Figure 1.
A workflow illustrating HMCan-diff steps. Initially, HMCan-diff constructs a fragment density profile for each provided ChIP-seq or input dataset. Then, it normalizes density profiles of each replicate in each condition for several types of bias, specifically for copy number variation, library size, GC-content bias and noise level. After that, HMCan-diff conducts additional normalization to eliminate further technical variation between conditions. It initializes HMM parameters based on the data. In particular, HMCan-diff defines the HMM emission probability distribution as the joint empirical distribution of normalized density values. Then, HMCan-diff improves these parameters using the Baum-Welch algorithm, and finishes by dividing genomic regions into three states: C1 (enriched in condition 1), C2 (enriched in condition 2), and the ‘no difference’ state.
Figure 2.
Precision-recall curves for HMCan-diff and other methods on simulated data. (A) Precision-recall curves on data simulated without copy number bias: HMCan-diff is slightly better than the majority of tools. (B) Precision-recall curves on the simulated data with copy number bias: HMCan-diff shows significantly better performance than the other methods.
Figure 3.
Effects of the use of replicates on HMCan-diff predictions. (A) When using replicate information, HMCan-diff produces better predictions than when using pooled data. (B) Genome browser view showing that combining data from different replicates may lead to losing the correct differential signal, while the use of information from replicates improves HMCan-diff prediction accuracy.
Figure 4.
Regardless of copy number status, gene expression changes correlate better with HMCan-diff predictions than with predictions generated by the other methods using the H3K27me3 histone mark. Cumulative values of S corresponding to the top differential peaks when comparing A549 vs. normal lung tissue (A), and MCF7 vs. HMEC (B). Cumulative values of S grouped by copy number state (neutral, gain and loss): A549 versus normal lung tissue (C), and MCF7 vs. HMEC (D).
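
The final step of the Figure 1 workflow assigns genomic regions to one of three states: C1, C2, or 'no difference'. The sketch below shows one way such an assignment can be made with a three-state HMM, using Viterbi decoding in log space. Several things here are assumptions rather than details from the source: the caption does not say which decoding rule HMCan-diff uses, the observations are reduced to three hypothetical symbols (`up1`, `up2`, `same`) instead of the joint empirical distribution of normalized densities, the parameter values are invented, and the Baum-Welch training step is omitted entirely.

```python
import math

STATES = ("C1", "C2", "no_difference")


def viterbi(observations, start_p, trans_p, emit_p):
    """Return the most likely state path for a sequence of symbols.

    All probabilities must be strictly positive, because the
    computation is carried out with logarithms.
    """
    if not observations:
        raise ValueError("observations must not be empty")

    # Log-probability of the best path ending in each state, per position.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
               for s in STATES}]
    back = []  # back[i][s] = best predecessor of state s at position i + 1

    for obs in observations[1:]:
        prev = scores[-1]
        row, ptr = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: prev[p] + math.log(trans_p[p][s]))
            row[s] = (prev[best_prev] + math.log(trans_p[best_prev][s])
                      + math.log(emit_p[s][obs]))
            ptr[s] = best_prev
        scores.append(row)
        back.append(ptr)

    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: scores[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical parameters: states are "sticky" and each state favors one symbol.
START = {s: 1 / 3 for s in STATES}
TRANS = {a: {b: (0.8 if a == b else 0.1) for b in STATES} for a in STATES}
EMIT = {
    "C1": {"up1": 0.8, "same": 0.15, "up2": 0.05},
    "C2": {"up1": 0.05, "same": 0.15, "up2": 0.8},
    "no_difference": {"up1": 0.1, "same": 0.8, "up2": 0.1},
}

if __name__ == "__main__":
    bins = ["up1"] * 3 + ["same"] * 3 + ["up2"] * 3
    print(viterbi(bins, START, TRANS, EMIT))
```

The sticky transition matrix is what turns per-bin calls into contiguous regions: an isolated discordant bin is usually absorbed into its neighbors, while a sustained run of bins forces a state change.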


