Predicting the ancestral character changes in a tree is typically easier than predicting the root state

Syst Biol. 2014 May;63(3):421-35. doi: 10.1093/sysbio/syu010. Epub 2014 Feb 21.

Abstract

Predicting the ancestral sequences of a group of homologous sequences related by a phylogenetic tree has been the subject of many studies, and numerous methods have been proposed for this purpose. Theoretical results are available that show that when the substitution rates become too large, reconstructing the ancestral state at the tree root is no longer feasible. Here, we also study the reconstruction of the ancestral changes that occurred along the tree edges. We show that, depending on the tree and branch length distribution, reconstructing these changes (i.e., reconstructing the ancestral state of all internal nodes in the tree) may be easier or harder than reconstructing the ancestral root state. However, results from information theory indicate that for the standard Yule tree, the task of reconstructing internal node states remains feasible, even for very high substitution rates. Moreover, computer simulations demonstrate that this result still holds for more complex trees and scenarios. For a large variety of counting-, parsimony-, and likelihood-based methods, the predictive accuracy for a randomly selected internal node in the tree is indeed much higher than the accuracy of the same method when applied to the tree root. Finally, parsimony- and likelihood-based methods appear to be remarkably robust to sampling bias and model mis-specification.
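
To make the distinction between root and internal-node reconstruction concrete, the following is a minimal sketch of the bottom-up pass of Fitch parsimony, one common member of the parsimony-based family mentioned above. It is an illustration only, not the authors' implementation: the nested-tuple tree encoding, the function name `fitch_sets`, and the toy leaf states are all assumptions introduced here.

```python
def fitch_sets(tree, sets=None):
    """Bottom-up Fitch pass on a rooted binary tree.

    `tree` is either a leaf (a one-character state string) or a pair
    (left_subtree, right_subtree). This encoding is hypothetical and
    chosen only to keep the sketch short.

    Returns (state_set_of_this_node, list_of_internal_node_sets), where
    the list is in post-order, so the last entry belongs to the root.
    """
    if sets is None:
        sets = []
    if isinstance(tree, str):
        # A leaf: its observed state is known with certainty.
        return frozenset([tree]), sets
    left, _ = fitch_sets(tree[0], sets)
    right, _ = fitch_sets(tree[1], sets)
    common = left & right
    # Fitch rule: intersection if non-empty, otherwise union.
    node_set = common if common else left | right
    sets.append(node_set)
    return node_set, sets


# Toy example (made-up states): two clades that disagree at the root.
example_tree = ((("A", "A"), ("A", "C")), ("C", "C"))
root_set, internal_sets = fitch_sets(example_tree)

# Count internal nodes whose state set is unambiguous (a single state).
resolved = sum(1 for s in internal_sets if len(s) == 1)
print(sorted(root_set), resolved, len(internal_sets))
```

In this toy tree the root set is ambiguous (both `A` and `C` remain candidates), while three of the five internal nodes are resolved to a single state by the bottom-up pass alone. A full Fitch reconstruction would add a top-down pass to pick one state per node; that step is omitted here for brevity.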

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Classification*
  • Computer Simulation
  • Likelihood Functions
  • Models, Theoretical*
  • Phylogeny*
  • Probability