19 Dubious Ways to Compute the Marginal Likelihood of a Phylogenetic Tree Topology

Mathieu Fourment; Andrew F Magee; Chris Whidden; Arman Bilge; Frederick A Matsen; Vladimir N Minin

doi:10.1093/sysbio/syz046

Syst Biol. 2020 Mar 1;69(2):209-220. doi: 10.1093/sysbio/syz046.

Authors

Mathieu Fourment¹, Andrew F Magee², Chris Whidden³, Arman Bilge³, Frederick A Matsen³, Vladimir N Minin⁴

Affiliations

¹ University of Technology Sydney, ithree Institute, Ultimo NSW 2007, Australia.
² Department of Biology, University of Washington, Seattle, WA 98195, USA.
³ Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
⁴ Department of Statistics, University of California, Irvine, CA 92697, USA.

Abstract

The marginal likelihood of a model is a key quantity for assessing the evidence provided by the data in support of a model. The marginal likelihood is the normalizing constant for the posterior density, obtained by integrating the product of the likelihood and the prior with respect to model parameters. Thus, the computational burden of computing the marginal likelihood scales with the dimension of the parameter space. In phylogenetics, where we work with tree topologies that are high-dimensional models, standard approaches to computing marginal likelihoods are very slow. Here, we study methods to quickly compute the marginal likelihood of a single fixed tree topology. We benchmark the speed and accuracy of 19 different methods to compute the marginal likelihood of phylogenetic topologies on a suite of real data sets under the JC69 model. These methods include several new ones that we develop explicitly to solve this problem, as well as existing algorithms that we apply to phylogenetic models for the first time. Altogether, our results show that the accuracy of these methods varies widely, and that accuracy does not necessarily correlate with computational burden. Our newly developed methods are orders of magnitude faster than standard approaches, and in some cases, their accuracy rivals the best established estimators.

Keywords: Bayesian inference; evidence; importance sampling; model selection; variational Bayes.
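
The abstract's definition of the marginal likelihood, as the integral of the likelihood times the prior over the model parameters, can be illustrated on a one-parameter toy problem. The sketch below is not taken from the paper and does not involve trees or the JC69 model. It assumes a binomial likelihood with a Beta prior (a model chosen here only because its marginal likelihood has a closed form), and compares that closed form with a naive Monte Carlo estimate that averages the likelihood over draws from the prior. All function names and the example data (7 successes in 10 trials) are hypothetical.

```python
import math
import random


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def exact_log_marginal(k, n, a=1.0, b=1.0):
    """Closed-form log marginal likelihood for a binomial likelihood
    with a Beta(a, b) prior: C(n, k) * B(k + a, n - k + b) / B(a, b)."""
    return math.log(math.comb(n, k)) + log_beta(k + a, n - k + b) - log_beta(a, b)


def likelihood(theta, k, n):
    """Binomial probability of k successes in n trials given theta."""
    return math.comb(n, k) * theta**k * (1.0 - theta) ** (n - k)


def naive_mc_log_marginal(k, n, a=1.0, b=1.0, draws=20_000, seed=0):
    """Naive Monte Carlo estimate: average the likelihood over prior draws.

    This approximates the integral of likelihood * prior directly. It works
    here because the toy problem has one parameter; the abstract's point is
    that such integrals become costly as the dimension grows.
    """
    rng = random.Random(seed)
    total = sum(likelihood(rng.betavariate(a, b), k, n) for _ in range(draws))
    return math.log(total / draws)


if __name__ == "__main__":
    k, n = 7, 10  # hypothetical data: 7 successes in 10 trials
    print("exact log marginal:   ", exact_log_marginal(k, n))
    print("naive MC log marginal:", naive_mc_log_marginal(k, n))
```

With a uniform Beta(1, 1) prior the exact marginal likelihood is 1/(n + 1), which gives a convenient check on both functions. The Monte Carlo estimate should land close to that value, but its error depends on the seed and the number of draws.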
