Fast Bayesian Inference of Copy Number Variants using Hidden Markov Models with Wavelet Compression

PLoS Comput Biol. 2016 May 13;12(5):e1004871. doi: 10.1371/journal.pcbi.1004871. eCollection 2016 May.


By integrating Haar wavelets with Hidden Markov Models, we achieve drastically reduced running times for Bayesian inference using Forward-Backward Gibbs sampling. We show that this improves detection of genomic copy number variants (CNV) in array CGH experiments compared to the state of the art, including standard Gibbs sampling. The method concentrates computational effort on chromosomal segments that are difficult to call, by dynamically and adaptively recomputing consecutive blocks of observations likely to share a copy number. This makes routine diagnostic use and re-analysis of legacy data collections feasible; to this end, we also propose an effective automatic prior. An open source software implementation of our method is available at DOI: 10.5281/zenodo.46262. This paper was selected for oral presentation at RECOMB 2016, and an abstract is published in the conference proceedings.
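
The idea of grouping consecutive observations into blocks likely to share a copy number can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed `threshold` parameter, and the power-of-two length requirement are assumptions made for this illustration. The sketch scores each position by the largest absolute orthonormal Haar coefficient of any wavelet with a discontinuity there, keeps positions whose score exceeds the threshold as block boundaries, and stores per-block sums that a sampler could use in place of the raw observations.

```python
import math


def haar_breakpoint_scores(y):
    """Score each position 0..n by the largest absolute orthonormal Haar
    coefficient among wavelets with a discontinuity at that position.

    Illustrative helper (hypothetical name); len(y) must be a power of two.
    """
    n = len(y)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    # Prefix sums let us read off the sum of any half-interval in O(1).
    prefix = [0.0]
    for value in y:
        prefix.append(prefix[-1] + value)

    scores = [0.0] * (n + 1)
    width = n
    while width >= 2:
        half = width // 2
        norm = math.sqrt(width)
        for start in range(0, n, width):
            mid, end = start + half, start + width
            left = prefix[mid] - prefix[start]
            right = prefix[end] - prefix[mid]
            coeff = abs(left - right) / norm
            # A Haar wavelet jumps at the start, middle and end of its support.
            for pos in (start, mid, end):
                scores[pos] = max(scores[pos], coeff)
        width = half
    return scores


def compress_into_blocks(y, threshold):
    """Split y into blocks at positions whose score exceeds `threshold`.

    Returns (start, end, sum, sum_of_squares) per block; `threshold` is a
    fixed, hypothetical parameter in this sketch.
    """
    n = len(y)
    scores = haar_breakpoint_scores(y)
    bounds = [0] + [i for i in range(1, n) if scores[i] > threshold] + [n]
    blocks = []
    for a, b in zip(bounds, bounds[1:]):
        segment = y[a:b]
        blocks.append((a, b, sum(segment), sum(v * v for v in segment)))
    return blocks


if __name__ == "__main__":
    signal = [0, 0, 0, 0, 2, 2, 2, 2]
    for block in compress_into_blocks(signal, threshold=1.0):
        print(block)
```

With a low threshold many boundaries survive and the blocks approach single observations; with a high threshold long stretches collapse into a few blocks. The abstract describes the blocks as recomputed dynamically and adaptively during sampling, which this fixed-threshold sketch does not attempt.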

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Breast Neoplasms / genetics
  • Cell Line
  • Comparative Genomic Hybridization / statistics & numerical data*
  • Computational Biology
  • Computer Simulation
  • DNA Copy Number Variations*
  • Data Compression
  • Female
  • Genome, Human
  • Humans
  • Markov Chains
  • Models, Genetic*
  • Software