Accuracies of ancestral amino acid sequences inferred by the parsimony, likelihood, and distance methods

J Zhang; M Nei

doi:10.1007/pl00000067

J Mol Evol. 1997;44 Suppl 1:S139-46. doi: 10.1007/pl00000067.

Authors

J Zhang¹, M Nei

Affiliation

¹ Institute of Molecular Evolutionary Genetics, Pennsylvania State University, Mueller Laboratory, University Park 16802, USA.

PMID: 9071022

Abstract

Information about protein sequences of ancestral organisms is important for identifying critical amino acid substitutions that have caused the functional change of proteins in evolution. Using computer simulation, we studied the accuracy of ancestral amino acids inferred by two currently available methods (maximum-parsimony [MP] and maximum-likelihood [ML] methods) in addition to a distance method, which was newly developed in this paper. All three methods give reliable inference when the divergence of amino acid sequences is low. When the extent of sequence divergence is high, however, the ML and distance methods give more accurate results than the MP method, particularly when the phylogenetic tree includes long branches. The accuracy of inferred ancestral amino acids does not change very much when a few present-day sequences are added or eliminated. When an incorrect model of amino acid substitution is used for the ML and distance methods, the accuracy decreases, but it is still higher than that for the MP method. When the tree topology used is partially incorrect, the accuracy in the correct part of the tree is virtually unaffected. The posterior probability of inferred ancestral amino acids computed by the ML and distance methods is an unbiased estimate of the true probability when a correct substitution model is used but may become an overestimate when a simpler model is used.
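
The posterior probability mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an equal-rates (Poisson) substitution model over 20 amino acids, equal prior frequencies, and a single ancestral node whose descendants are observed directly; the names `poisson_transition` and `ancestral_posterior` are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def poisson_transition(a, b, t):
    """Probability of a -> b along a branch of length t (expected
    substitutions per site) under an equal-rates Poisson model.
    This model is an assumption of the sketch, not taken from the paper."""
    decay = math.exp(-20.0 * t / 19.0)
    if a == b:
        return 1.0 / 20.0 + (19.0 / 20.0) * decay
    return 1.0 / 20.0 - (1.0 / 20.0) * decay


def ancestral_posterior(observed):
    """Posterior probability of each amino acid at one ancestral node.

    observed: list of (amino_acid, branch_length) pairs, one per
    descendant attached directly to the ancestor (hypothetical input shape).
    """
    weights = {}
    for a in AMINO_ACIDS:
        w = 1.0 / 20.0  # equal prior frequency for every amino acid
        for state, t in observed:
            w *= poisson_transition(a, state, t)
        weights[a] = w
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Two close descendants carry A; one distant descendant carries S.
posterior = ancestral_posterior([("A", 0.1), ("A", 0.1), ("S", 0.5)])
best = max(posterior, key=posterior.get)
```

In this toy case `best` is the residue with the highest posterior probability at the node, and `posterior[best]` is the kind of quantity whose calibration the abstract discusses. On a real tree the same calculation would be propagated over all internal nodes and would typically use an empirical substitution model.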

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Amino Acid Sequence*
Computer Simulation*
Evolution, Molecular
Likelihood Functions
Phylogeny*
Proteins / genetics

Substances

Proteins