Estimation of divergence times from multiprotein sequences for a few mammalian species and several distantly related organisms

Proc Natl Acad Sci U S A. 2001 Feb 27;98(5):2497-502. doi: 10.1073/pnas.051611498. Epub 2001 Feb 20.

Abstract

When many protein sequences are available for estimating the time of divergence between two species, it is customary to estimate the time for each protein separately and then use the average for all proteins as the final estimate. However, it can be shown that this estimate generally has an upward bias, and that an unbiased estimate is obtained by using distances based on concatenated sequences. We have shown that two concatenation-based distances, i.e., average gamma distance weighted with sequence length (d₂) and multiprotein gamma distance (d₃), generally give more satisfactory results than other concatenation-based distances. Using these two distance measures for 104 protein sequences, we estimated the time of divergence between mice and rats to be approximately 33 million years ago. Similarly, the time of divergence between humans and rodents was estimated to be approximately 96 million years ago. We also investigated the dependency of time estimates on statistical methods and various assumptions made by using sequence data from eubacteria, protists, plants, fungi, and animals. Our best estimates of the times of divergence between eubacteria and eukaryotes, between protists and other eukaryotes, and between plants, fungi, and animals were 3, 1.7, and 1.3 billion years ago, respectively. However, estimates of ancient divergence times are subject to a substantial amount of error caused by uncertainty of the molecular clock, horizontal gene transfer, errors in sequence alignments, etc.
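
The contrast drawn in the abstract, between averaging per-protein time estimates and deriving one estimate from a pooled distance, can be illustrated with a minimal Python sketch. The sketch is not taken from the paper. It assumes the standard gamma distance for amino acid sequences, d = a[(1 - p)^(-1/a) - 1], with a placeholder shape parameter a = 2.0. It reads the length-weighted average gamma distance as sum(L_i * d_i) / sum(L_i). It converts a distance to a time by linear scaling against a calibration pair of known divergence time. The function names and the calibration step are hypothetical, and the multiprotein gamma distance is not implemented here.

```python
def gamma_distance(p, a=2.0):
    """Gamma distance from the proportion p of differing amino acid sites.

    The shape parameter a = 2.0 is a placeholder assumption, not a value
    taken from the paper.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return a * ((1.0 - p) ** (-1.0 / a) - 1.0)


def length_weighted_distance(p_values, lengths, a=2.0):
    """Average of per-protein gamma distances, weighted by sequence length."""
    if len(p_values) != len(lengths) or not lengths:
        raise ValueError("need one length per protein, and at least one protein")
    total_length = sum(lengths)
    weighted_sum = sum(n * gamma_distance(p, a) for p, n in zip(p_values, lengths))
    return weighted_sum / total_length


def divergence_time(d_target, d_calibration, t_calibration):
    """Scale a known calibration time by the ratio of distances (clock assumed)."""
    return t_calibration * d_target / d_calibration


def mean_of_per_protein_times(d_targets, d_calibrations, t_calibration):
    """The customary approach: estimate a time per protein, then average."""
    times = [divergence_time(dt, dc, t_calibration)
             for dt, dc in zip(d_targets, d_calibrations)]
    return sum(times) / len(times)
```

With per-protein distances for a target pair and for a calibration pair, `mean_of_per_protein_times` averages a set of ratios, whereas passing two pooled distances to `divergence_time` takes a single ratio. The two results generally differ, which is the distinction the abstract relies on.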

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Evolution, Molecular*
  • Humans
  • Mice
  • Phylogeny
  • Proteins / genetics*
  • Rats

Substances

  • Proteins