Increased accuracy in analytical molecular distance estimation

Theor Popul Biol. 1998 Aug;54(1):78-90. doi: 10.1006/tpbi.1998.1362.

Abstract

Analytical molecular distances can be inaccurate and biased estimates of the total number of substitutions, not only when the model of evolution they are based on is incorrect but also when the method of estimating the total is too simple. This happens because, when different types of substitutions occur simultaneously, it can become extremely difficult to estimate the number of the more quickly evolving type, and the variance of this larger number can overwhelm the total estimate. In this paper, which extends earlier work with a simple two-parameter model of evolution, more accurate analytical distances are derived for models appropriate to a variety of known DNA types, using generalized least squares principles of noise reduction. It is shown that the new estimates give more accurate results in the presence of site-to-site rate variation, in regions with biased nucleotide frequencies, and at synonymous sites in protein-coding regions. This study also includes a methodology for obtaining accurate distance estimates for large numbers of sequence regions evolving in different manners.
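
The variance problem described in the abstract can be illustrated with a minimal sketch. It assumes that the "simple two-parameter model" is Kimura's two-parameter model, which the abstract does not name, and it uses a first-order (delta-method) variance approximation with multinomial sampling of sites. The function names are hypothetical, and this is not the generalized least squares estimator derived in the paper; it only shows how the faster-evolving substitution type (here taken to be transitions) can dominate the variance.

```python
import math


def expected_proportions(a, b):
    """Expected proportions of transition (P) and transversion (Q) differences
    under the Kimura two-parameter model (an assumption, see lead-in), given
    a transition and b transversion substitutions per site."""
    q = 0.5 * (1.0 - math.exp(-2.0 * b))
    p = 0.5 * (1.0 - q - math.exp(-(2.0 * a + b)))
    return p, q


def k2p_components(p, q):
    """Split the Kimura two-parameter distance into its transition and
    transversion components; their sum is the usual total distance."""
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    a = -0.5 * math.log(w1) + 0.25 * math.log(w2)  # transitions per site
    b = -0.5 * math.log(w2)                        # transversions per site
    return a, b


def delta_variances(p, q, n):
    """First-order variances of the two components for n aligned sites,
    assuming sites are independent multinomial draws."""
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    var_p = p * (1.0 - p) / n
    var_q = q * (1.0 - q) / n
    cov_pq = -p * q / n
    # Partial derivatives of the components with respect to P and Q.
    da_dp = 1.0 / w1
    da_dq = 0.5 / w1 - 0.5 / w2
    db_dq = 1.0 / w2
    var_a = da_dp ** 2 * var_p + da_dq ** 2 * var_q + 2.0 * da_dp * da_dq * cov_pq
    var_b = db_dq ** 2 * var_q
    return var_a, var_b


# Hypothetical example: transitions accumulate ten times faster than transversions.
p, q = expected_proportions(a=1.0, b=0.1)
var_a, var_b = delta_variances(p, q, n=1000)
print("variance of transition estimate:  ", var_a)
print("variance of transversion estimate:", var_b)
```

In this setting the transition component contributes most of the variance of the summed distance, which is the situation in which a simple sum of the two components is a poor estimate of the total.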

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Biological Evolution*
  • DNA Mutational Analysis / statistics & numerical data*
  • Genetics, Population*
  • Humans
  • Least-Squares Analysis
  • Models, Genetic
  • Phylogeny