Estimating amino acid substitution models: a comparison of Dayhoff's estimator, the resolvent approach and a maximum likelihood method

Mol Biol Evol. 2002 Jan;19(1):8-13. doi: 10.1093/oxfordjournals.molbev.a003985.

Abstract

Evolution of proteins is generally modeled as a Markov process acting on each site of the sequence. Replacement frequencies need to be estimated based on sequence alignments. Here we compare three approaches: first, the original method by Dayhoff, Schwartz, and Orcutt (1978; Atlas Protein Seq. Struc. 5:345-352); second, the resolvent method (RV) by Müller and Vingron (2000; J. Comput. Biol. 7(6):761-776); and third, a maximum likelihood approach (ML) developed in this paper. We evaluate the methods using a highly divergent and inhomogeneous set of sequence alignments as input to the estimation procedure. ML is the method of choice for small sets of input data. Although the RV method is computationally much less demanding, it performs only slightly worse than ML. Therefore, the RV method is perfectly appropriate for large-scale applications.
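
As a rough illustration of the Markov model mentioned above, the sketch below computes transition probabilities P(t) = exp(tQ) from a rate matrix Q for a continuous-time Markov process. It is not the estimation procedure of any of the three methods compared here. The 3-letter alphabet, the rate values in `Q`, and the helper names `mat_mul`, `identity`, and `transition_matrix` are assumptions made up for the example; a real model would use a 20 x 20 matrix estimated from alignments.

```python
# Toy illustration: a 3-letter alphabet stands in for the 20 amino acids.
# All rate values below are invented for this example, not taken from the paper.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def transition_matrix(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(t * Q) by scaling and squaring a Taylor series."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = identity(n)
    term = identity(n)
    # Taylor series of exp(a): I + a + a^2/2! + ... up to `terms` terms.
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    # Undo the scaling: exp(t * Q) = exp(a) ** (2 ** squarings).
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

# Hypothetical rate matrix: off-diagonal entries are replacement rates,
# and each row sums to zero.
Q = [
    [-0.3, 0.2, 0.1],
    [0.2, -0.5, 0.3],
    [0.1, 0.3, -0.4],
]

P = transition_matrix(Q, t=1.0)
for row in P:
    print([round(x, 4) for x in row])
```

Each row of `P` is a probability distribution over the state a site occupies after evolutionary time t, given its starting state. Estimating the model from alignments amounts to recovering Q (or a family of P(t) matrices) from observed replacement counts.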

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Amino Acid Substitution*
  • Computer Simulation*
  • Evolution, Molecular*
  • Likelihood Functions
  • Markov Chains
  • Models, Genetic
  • Proteins / chemistry*
  • Sequence Alignment

Substances

  • Proteins