Optimal point process filtering and estimation of the coalescent process

J Theor Biol. 2017 May 21:421:153-167. doi: 10.1016/j.jtbi.2017.04.001. Epub 2017 Apr 3.

Abstract

The coalescent process is a widely used approach for inferring the demographic history of a population from samples of its genetic diversity. Several parametric and non-parametric coalescent inference methods, involving Markov chain Monte Carlo, Gaussian processes, and other algorithms, already exist. However, these techniques are not always easy to adapt and apply, thus creating a need for alternative methodologies. We introduce the Bayesian Snyder filter as an easily implementable and flexible minimum mean square error estimator for parametric demographic functions on fixed genealogies. By reinterpreting the coalescent as a self-exciting Markov process, we show that the Snyder filter can be applied to both isochronously and heterochronously sampled datasets. We analytically solve the filter equations for the constant population size Kingman coalescent, derive expressions for its mean squared estimation error, and estimate its robustness to prior distribution specification. For populations with deterministically time-varying size, we numerically solve the Snyder equations and test this solution on common demographic models. We find that the Snyder filter accurately recovers the true demographic history for these models. We also apply the filter to a well-studied dataset of hepatitis C virus sequences and show that the filter compares well to a popular phylodynamic inference method. The Snyder filter is an exact (given discretised priors, it does not approximate the posterior) and direct Bayesian estimation method that has the potential to become a useful alternative tool for coalescent inference.
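
To make the constant population size case concrete, the following is a minimal sketch of a minimum mean square error estimate over a discretised prior. It is not the paper's implementation and does not solve the Snyder filter equations; it applies Bayes' rule directly to the standard Kingman coalescent likelihood, in which the waiting time while k lineages remain is exponential with rate k(k-1)/(2N). The function names, the uniform prior, the grid, the time scaling (N in the same units as the intervals), and the toy intervals are all assumptions made for illustration.

```python
import math


def kingman_grid_posterior(intervals, n_samples, grid, prior=None):
    """Posterior over a grid of constant population sizes N.

    intervals[i] is the waiting time during which n_samples - i lineages
    exist (isochronous sampling assumed). Hypothetical helper.
    """
    if prior is None:
        # Assumption: uniform prior over the grid points.
        prior = [1.0 / len(grid)] * len(grid)
    log_post = []
    for N, p in zip(grid, prior):
        ll = math.log(p)
        for i, t in enumerate(intervals):
            k = n_samples - i
            rate = k * (k - 1) / 2.0 / N  # pairwise coalescent rate
            ll += math.log(rate) - rate * t  # exponential log-density
        log_post.append(ll)
    # Normalise in log space to avoid underflow.
    m = max(log_post)
    weights = [math.exp(l - m) for l in log_post]
    z = sum(weights)
    return [w / z for w in weights]


def mmse_estimate(grid, posterior):
    """Posterior mean, which minimises the mean squared error."""
    return sum(N * p for N, p in zip(grid, posterior))


# Toy data (assumption): each interval set to its expected value for N = 100.
n_samples = 20
true_N = 100.0
intervals = [true_N / (k * (k - 1) / 2.0) for k in range(n_samples, 1, -1)]

grid = [10.0 * j for j in range(1, 51)]  # candidate N values: 10, 20, ..., 500
posterior = kingman_grid_posterior(intervals, n_samples, grid)
estimate = mmse_estimate(grid, posterior)
print(f"posterior mean of N: {estimate:.1f}")
```

Because the likelihood is skewed in N, the posterior mean generally sits above the posterior mode, so the printed estimate need not coincide with the grid point that has the highest posterior weight.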

Keywords: Bayesian inference; Coalescent theory; Non-linear filters; Parametric estimation; Snyder filters.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Bayes Theorem*
  • Genetic Variation*
  • Genetics, Population
  • Hepacivirus / genetics
  • Markov Chains*
  • Population Density