General time-reversible distances with unequal rates across sites: mixing gamma and inverse Gaussian distributions with invariant sites

Mol Phylogenet Evol. 1997 Dec;8(3):398-414. doi: 10.1006/mpev.1997.0452.

Abstract

A series of new results useful to the study of DNA sequences using Markov models of substitution are presented with proofs. General time-reversible distances can be extended to accommodate any fixed distribution of rates across sites by replacing the logarithmic function of a matrix with the inverse of a moment generating function. Estimators are presented assuming a gamma distribution, the inverse Gaussian distribution, or a mixture of either of these with invariant sites. Also considered are the different ways invariant sites may be removed and how these differences may affect estimated distances. Through collaboration, we implemented these distances into PAUP in 1994. The variance of these new distances is approximated via the delta method. It is also shown how to predict the divergence expected for a pair of sequences given a rate matrix and a distribution of rates across sites, allowing iterated ML estimates of distances under any reversible model. A simple test of whether a rate matrix is time reversible is also presented. These new methods are used to estimate the divergence time of humans and chimps from mtDNA sequence data. These analyses support suggestions that the human lineage has an enhanced transition rate relative to other hominoids. These studies also show that transversion distances differ substantially from the overall distances which are dominated by transitions. Transversions alone apparently suggest a very recent divergence time for humans versus chimps and/or a very old (> 16 myr) divergence time for humans versus orangutans. This work illustrates graphically ways to interpret the reliability of distance-based transformations, using the corrected transition to transversion ratio returned for pairs of sequences which are successively more diverged.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • DNA, Mitochondrial / genetics
  • Humans
  • Markov Chains
  • Models, Genetic*
  • Primates / genetics

Substances

  • DNA, Mitochondrial