Probability Steiner trees and maximum parsimony in phylogenetic analysis

J Math Biol. 2012 Jun;64(7):1225-51. doi: 10.1007/s00285-011-0442-4. Epub 2011 Jun 25.

Abstract

The phylogenetic tree (PT) problem has been studied by a number of researchers as an application of the Steiner tree problem, a well-known network optimisation problem. Among the methods developed for inferring phylogenies, the maximum parsimony (MP) method is simple and commonly used because it relies on directly observable changes in the input nucleotide or amino acid sequences. In this paper we show that the non-uniqueness of the evolutionary pathways in the MP method leads us to consider a new model of PTs. In this so-called probability representation model, a node in a PT is modelled at each site by a probability distribution over nucleotide or amino acid states, and hence the PT at a given site is a probability Steiner tree, i.e. a Steiner tree in a high-dimensional vector space. Despite the generality of the probability representation model, in this paper we restrict our study to constructing probability phylogenetic trees (PPTs) using the parsimony criterion, and to discussing and comparing our approach with the classical MP method. We show that, for a given input set, although the optimal topology and the total tree length of the PPT are the same as those of the PT constructed by the classical MP method, the inferred ancestral states and branch lengths are different, and the results given by our method provide a plausible alternative to the classical ones.
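
As a hedged illustration of the final claim, the following Python sketch works through a single site on a star tree whose three leaves carry the states A, C and G. It is not the paper's algorithm. The half-L1 distance between distributions, the helper names (`unit`, `half_l1`, `star_length`) and the choice of a uniform ancestral distribution are all assumptions made for this example, since the abstract does not specify the metric.

```python
from fractions import Fraction

STATES = "ACGT"  # nucleotide alphabet, one coordinate per state


def unit(state):
    """Point distribution: all probability mass on a single state."""
    return tuple(Fraction(int(s == state)) for s in STATES)


def half_l1(p, q):
    """Half the L1 distance (ASSUMED metric, not taken from the paper).

    It equals 1 between two distinct point distributions, so it agrees
    with counting one substitution in classical parsimony.
    """
    return sum(abs(a - b) for a, b in zip(p, q)) / 2


def star_length(center, leaves):
    """Total length of a star tree joining `center` to every leaf."""
    return sum(half_l1(center, leaf) for leaf in leaves)


leaves = [unit("A"), unit("C"), unit("G")]

# Classical MP: the ancestor must be a single state. Each of A, C and G
# is equally parsimonious, which is the non-uniqueness the abstract notes.
classical = {s: star_length(unit(s), leaves) for s in "ACG"}

# Probability representation: the ancestor is a distribution over states.
# Here we try the uniform distribution over the three observed states.
uniform = tuple(Fraction(c, 3) for c in (1, 1, 1, 0))
prob_branches = [half_l1(uniform, leaf) for leaf in leaves]
prob_total = sum(prob_branches)

print("classical totals:", classical)
print("probabilistic branch lengths:", prob_branches)
print("probabilistic total:", prob_total)
```

Under these assumptions, every classical choice of ancestor gives a total length of 2 with branch lengths 0, 1 and 1 in some order, while the uniform ancestor gives three branches of length 2/3 and the same total of 2. The total tree length agrees, but the ancestral state and the branch lengths do not.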

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • Evolution, Molecular
  • Models, Genetic
  • Models, Statistical*
  • Molecular Sequence Data
  • Phylogeny*
  • Probability