Reliabilities of identifying positive selection by the branch-site and the site-prediction methods

Proc Natl Acad Sci U S A. 2009 Apr 21;106(16):6700-5. doi: 10.1073/pnas.0901855106. Epub 2009 Apr 1.

Abstract

Natural selection operating on protein-coding genes is often studied by examining the ratio (omega) of the rates of nonsynonymous to synonymous nucleotide substitution. The branch-site method (BSM), based on a likelihood ratio test, is one such test and is used to detect positive selection on a predetermined branch of a phylogenetic tree. However, because the number of nucleotide substitutions involved is often very small, we conducted a computer simulation to examine the reliability of BSM in comparison with the small-sample method (SSM), which is based on Fisher's exact test. The results indicate that BSM often generates false positives compared with SSM when the number of nucleotide substitutions is approximately 80 or smaller. Because the omega value is also used for predicting positively selected sites, we examined the reliabilities of the site-prediction methods, using nucleotide sequence data for the dim-light and color vision genes in vertebrates. The results showed that the site-prediction methods have a low probability of identifying experimentally determined functional amino acid changes and often falsely identify other sites where amino acid substitutions are unlikely to be important. This low rate of predictability occurs because most current statistical methods are designed to identify codon sites with high omega values, which may have nothing to do with functional changes. The codon sites showing functional changes generally do not show a high omega value. To understand adaptive evolution, some form of experimental confirmation is necessary.
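The abstract states only that the small-sample method is based on Fisher's exact test. As a rough illustration of how such a test might be set up for a single branch, the sketch below builds a 2x2 table from nonsynonymous and synonymous substitution counts and the corresponding numbers of unchanged sites, then computes a one-sided p-value. The table layout, the rounding of site counts to integers, the function names, and the example numbers are all assumptions made for illustration. They are not taken from the paper.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(top-left cell >= a) with all row and column totals held fixed
    (the upper tail of the hypergeometric distribution).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    # Sum the hypergeometric numerators for every table at least as extreme.
    tail = sum(
        comb(row1, x) * comb(row2, col1 - x)
        for x in range(a, min(row1, col1) + 1)
    )
    return tail / comb(total, col1)


def ssm_p_value(n_subs, s_subs, n_sites, s_sites):
    """Hypothetical helper: test for an excess of nonsynonymous changes.

    n_subs, s_subs   : nonsynonymous / synonymous substitutions on the branch
    n_sites, s_sites : estimated nonsynonymous / synonymous sites
                       (assumption: rounded to integers for the exact test)
    """
    n_unchanged = round(n_sites) - n_subs
    s_unchanged = round(s_sites) - s_subs
    if n_unchanged < 0 or s_unchanged < 0:
        raise ValueError("substitution counts exceed the number of sites")
    return fisher_exact_greater(n_subs, n_unchanged, s_subs, s_unchanged)


# Made-up example values, for illustration only.
p = ssm_p_value(n_subs=9, s_subs=1, n_sites=200.4, s_sites=79.6)
print(f"one-sided p = {p:.4f}")
```

Because the test is exact, it remains valid when the substitution counts are very small, the situation in which the abstract reports that the likelihood ratio test produces false positives.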

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Color Vision / genetics
  • Computer Simulation
  • False Positive Reactions
  • Models, Statistical*
  • Phylogeny
  • Primates / genetics
  • Reproducibility of Results
  • Selection, Genetic*