Computationally Tractable Multivariate HMM in Genome-Wide Mapping Studies

Hyungwon Choi; Debashis Ghosh; Zhaohui Qin

doi:10.1007/978-1-4939-6753-7_10

Methods Mol Biol. 2017:1552:135-148. doi: 10.1007/978-1-4939-6753-7_10.

Authors

Hyungwon Choi¹, Debashis Ghosh², Zhaohui Qin³,⁴

Affiliations

¹ Saw Swee Hock School of Public Health, Tahir Foundation Building, National University of Singapore, Singapore, 117549, Singapore. hyung_won_choi@nuhs.edu.sg.
² Department of Biostatistics & Informatics, University of Colorado Anschutz Medical Campus, University Park, State College, PA, USA.
³ Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA.
⁴ Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA.

PMID: 28224496
DOI: 10.1007/978-1-4939-6753-7_10

Abstract

The hidden Markov model (HMM) is widely used for modeling spatially correlated genomic data (series data). In genomics, datasets of this kind are generated from genome-wide mapping studies through high-throughput methods such as chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq). When multiple regulatory protein binding sites or related epigenetic modifications are mapped simultaneously, the correlation between data series can be incorporated into the latent variable inference in a multivariate form of HMM, potentially increasing the statistical power of signal detection. In this chapter, we review the challenges of multivariate HMMs and propose a computationally tractable method called sparsely correlated HMMs (scHMM). We illustrate the method and the scHMM package using an example mouse ChIP-seq dataset.
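To make the latent variable inference mentioned above concrete, the following is a minimal sketch of a single-series, two-state HMM (background vs. enriched) applied to binned read counts. It is not the scHMM method or the scHMM package interface. The function name `posterior_enriched`, the Poisson emission model, the rates, and the transition probability are all illustrative assumptions, and the counts in the usage note are made-up toy values.

```python
import math

def poisson_pmf(k, lam):
    # Poisson probability mass, evaluated via logs for numerical stability.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def posterior_enriched(counts, rates=(2.0, 10.0), stay=0.9, start=(0.5, 0.5)):
    """Posterior probability of the enriched state (state 1) in each bin.

    Uses the scaled forward-backward algorithm. All parameters are
    hypothetical defaults chosen for illustration, not fitted values.
    """
    n = len(counts)
    states = (0, 1)
    # Symmetric transition matrix: remain in the same state with prob. `stay`.
    trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]
    # Emission likelihood of each bin's count under each state.
    emit = [[poisson_pmf(c, rates[s]) for s in states] for c in counts]

    # Forward pass, normalised at every step to avoid underflow.
    first = [start[s] * emit[0][s] for s in states]
    z = sum(first)
    alpha = [[p / z for p in first]]
    for t in range(1, n):
        cur = [emit[t][s] * sum(alpha[t - 1][r] * trans[r][s] for r in states)
               for s in states]
        z = sum(cur)
        alpha.append([p / z for p in cur])

    # Backward pass, also normalised at every step.
    beta = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        cur = [sum(trans[s][r] * emit[t + 1][r] * beta[t + 1][r] for r in states)
               for s in states]
        z = sum(cur)
        beta[t] = [p / z for p in cur]

    # Combine both passes into per-bin posterior probabilities.
    post = []
    for t in range(n):
        w = [alpha[t][s] * beta[t][s] for s in states]
        post.append(w[1] / (w[0] + w[1]))
    return post
```

Calling `posterior_enriched([1, 2, 0, 3, 12, 9, 11, 2, 1, 0])` on these toy counts yields high posterior probabilities for the run of large counts and low ones elsewhere. A multivariate HMM would additionally let the hidden state of one series depend on the states of other series, which is where the computational difficulty addressed by the chapter arises.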

Keywords: Genome-wide mapping study; Hidden Markov model.

MeSH terms

Algorithms
Animals
Binding Sites
Chromatin Immunoprecipitation / methods*
Chromosome Mapping / methods*
Computational Biology / methods*
Epigenesis, Genetic
Genome*
Genomics / methods*
Markov Chains*
Mice
Regulatory Sequences, Nucleic Acid
Transcription Factors / metabolism

Substances

Transcription Factors

Grants and funding

R01 GM072007/GM/NIGMS NIH HHS/United States