Recovering evolutionary trees under a more realistic model of sequence evolution

Mol Biol Evol. 1994 Jul;11(4):605-12. doi: 10.1093/oxfordjournals.molbev.a040136.


We report a new transformation, the LogDet, that is consistent for sequences with differing nucleotide composition and that have arisen under simple but asymmetric stochastic models of evolution. This transformation is required because existing methods tend to group sequences on the basis of their nucleotide composition, irrespective of their evolutionary history. This effect of differing nucleotide frequencies is illustrated by using a tree-selection criterion on a simple distance measure defined solely on the basis of base composition, independent of the actual sequences. The new LogDet transformation uses determinants of the observed divergence matrices and works because multiplication of determinants (real numbers) is commutative, whereas multiplication of matrices is not, except in special symmetric cases. The use of determinants thus allows more general models of evolution with asymmetric rates of nucleotide change. The transformation is illustrated on a theoretical data set (where existing methods select the wrong tree) and with three biological data sets: chloroplasts, birds/mammals (nuclear), and honeybees (mitochondrial). The LogDet transformation reinforces the logical distinction between transformations on the data and tree-selection criteria. The overall conclusions from this study are that irregular A,C,G,T compositions are an important and possibly general cause of patterns that can mislead tree-reconstruction methods, even when high bootstrap values are obtained. Consequently, many published studies may need to be reexamined.
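As an illustration of the transformation described above, the following is a minimal sketch, not the authors' implementation. It assumes the unnormalized form d_xy = -ln det(F_xy), where F_xy is the 4 x 4 divergence matrix whose entry (i, j) is the proportion of aligned sites with base i in sequence x and base j in sequence y. The function names (`divergence_matrix`, `determinant`, `logdet_distance`) are hypothetical, and sites containing anything other than A, C, G, or T are simply skipped, which is an assumption of this sketch. Under this unnormalized form, identical sequences do not give a distance of zero.

```python
import math

BASES = "ACGT"


def divergence_matrix(seq_x, seq_y):
    """Return the 4x4 matrix of joint base proportions for two aligned sequences."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    index = {base: i for i, base in enumerate(BASES)}
    counts = [[0.0] * 4 for _ in range(4)]
    total = 0
    for a, b in zip(seq_x.upper(), seq_y.upper()):
        # Assumption of this sketch: skip gaps and ambiguity codes.
        if a in index and b in index:
            counts[index[a]][index[b]] += 1.0
            total += 1
    if total == 0:
        raise ValueError("no comparable sites")
    return [[c / total for c in row] for row in counts]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def logdet_distance(seq_x, seq_y):
    """Unnormalized LogDet distance: -ln det(F_xy)."""
    det = determinant(divergence_matrix(seq_x, seq_y))
    if det <= 0.0:
        raise ValueError("determinant is not positive; distance undefined")
    return -math.log(det)


if __name__ == "__main__":
    x = "ACGT" * 10
    y = "C" + x[1:]  # one substitution at the first site
    print(logdet_distance(x, x), logdet_distance(x, y))
```

Because the determinant of a matrix equals that of its transpose, this distance is symmetric in x and y even when the underlying divergence matrix is not. That is consistent with the abstract's point that working with determinants avoids the non-commutativity of matrix multiplication.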

MeSH terms

  • Animals
  • Base Sequence*
  • Bees / genetics
  • DNA, Mitochondrial / genetics
  • Evolution, Molecular*
  • Humans
  • Models, Genetic*
  • Models, Statistical
  • Phylogeny*
  • RNA, Ribosomal, 18S / genetics


Substances

  • DNA, Mitochondrial
  • RNA, Ribosomal, 18S