Mol Biol Evol. 2011 Nov;28(11):3033-43.
doi: 10.1093/molbev/msr125. Epub 2011 Jun 13.

A random effects branch-site model for detecting episodic diversifying selection

Sergei L Kosakovsky Pond et al. Mol Biol Evol. 2011 Nov.

Abstract

Adaptive evolution frequently occurs in episodic bursts, localized to a few sites in a gene, and to a small number of lineages in a phylogenetic tree. A popular class of "branch-site" evolutionary models provides a statistical framework to search for evidence of such episodic selection. For computational tractability, current branch-site models unrealistically assume that all branches in the tree can be partitioned a priori into two rigid classes: "foreground" branches that are allowed to undergo diversifying selective bursts and "background" branches that are negatively selected or neutral. We demonstrate that this assumption leads to unacceptably high rates of false positives or false negatives when the evolutionary process along background branches strongly deviates from modeling assumptions. To address this problem, we extend Felsenstein's pruning algorithm to allow efficient likelihood computations for models in which variation over branches (and not just sites) is described in the random effects likelihood framework. This enables us to model the process at every branch-site combination as a mixture of three Markov substitution models; our model treats the selective class of every branch at a particular site as an unobserved state that is chosen independently of that at any other branch. When benchmarked on a previously published set of simulated sequences, our method consistently matched or outperformed existing branch-site tests in terms of power and error rates. Using three empirical data sets, previously analyzed for episodic selection, we discuss how modeling assumptions can influence inference in practical situations.
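The computational idea in the abstract can be illustrated with a small sketch. If the selective class of each branch is drawn independently, the sum over all class assignments factorizes branch by branch, so a standard pruning pass that uses a per-branch mixture of transition matrices should give the same site likelihood as brute-force enumeration. The code below is not the authors' implementation: it assumes a toy four-state Jukes-Cantor model with three rate classes standing in for the three codon substitution models, and the tree, branch lengths, rates, and weights are all hypothetical values chosen for illustration.

```python
import math
from itertools import product

STATES = "ACGT"  # toy nucleotide alphabet; the paper works with codon models

def jc_matrix(rate, t):
    """Jukes-Cantor transition probabilities for a branch of length t, scaled by rate."""
    e = math.exp(-4.0 * rate * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def mix(mats, weights):
    """Weighted average of transition matrices (one weight per class)."""
    return [[sum(w * m[i][j] for m, w in zip(mats, weights)) for j in range(4)]
            for i in range(4)]

def partial(node, P):
    """Felsenstein pruning: conditional likelihoods at a node.

    A leaf is a one-character string; an internal node is a list of
    (child, branch_id) pairs. P maps branch_id to a 4x4 transition matrix.
    """
    if isinstance(node, str):
        return [1.0 if s == node else 0.0 for s in STATES]
    out = [1.0] * 4
    for child, b in node:
        below = partial(child, P)
        for i in range(4):
            out[i] *= sum(P[b][i][j] * below[j] for j in range(4))
    return out

def site_likelihood(tree, P):
    """Site likelihood with a uniform root distribution (stationary for this toy model)."""
    return sum(0.25 * x for x in partial(tree, P))

def mixture_likelihood(tree, lengths, rates, weights):
    """One pruning pass using a mixture matrix on every branch."""
    P = {b: mix([jc_matrix(r, t) for r in rates], weights[b])
         for b, t in lengths.items()}
    return site_likelihood(tree, P)

def brute_force_likelihood(tree, lengths, rates, weights):
    """Explicit sum over every assignment of classes to branches (3^B terms)."""
    branches = sorted(lengths)
    total = 0.0
    for assignment in product(range(len(rates)), repeat=len(branches)):
        P = {b: jc_matrix(rates[k], lengths[b]) for b, k in zip(branches, assignment)}
        prob = math.prod(weights[b][k] for b, k in zip(branches, assignment))
        total += prob * site_likelihood(tree, P)
    return total

# Hypothetical five-branch tree (not the phylogeny of FIG. 1) and parameters.
tree = [([("A", 0), ("C", 1)], 4), ("A", 2), ("G", 3)]
lengths = {0: 0.10, 1: 0.20, 2: 0.15, 3: 0.30, 4: 0.05}
rates = [0.2, 1.0, 3.0]  # stand-ins for the three selective classes
weights = {b: (0.6, 0.3, 0.1) for b in lengths}
weights[4] = (0.2, 0.3, 0.5)  # each branch may carry its own class weights

fast = mixture_likelihood(tree, lengths, rates, weights)
slow = brute_force_likelihood(tree, lengths, rates, weights)
```

In this sketch `fast` requires a single pruning pass, while `slow` grows as 3^B with the number of branches B, which is why the factorization matters for trees of realistic size.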


Figures

FIG. 1.
An illustration of episodic selection profiles at a single site with three possible regimes: negative, neutral (or nearly neutral), and diversifying selection along a branch. Panel (A) depicts the phylogeny used for discussion in the text and to carry out robustness simulations; Branch 5 is designated as foreground (FG), and the remaining four branches as background (BG). Panel (B) illustrates the four a priori selective profiles allowed by the model of Zhang et al. (2005). Panel (C) shows 2 of 239 possible selective profiles not modeled by current branch-site models; these profiles are used in robustness simulations (see Methods).
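The figure of 239 in the caption can be checked by counting. Assuming each of the five branches (four background, one foreground) can independently fall under any of the three regimes, there are 3^5 = 243 profiles in total; subtracting the four profiles the caption attributes to the model of Zhang et al. (2005) leaves 239. A minimal sketch of that count, with variable names chosen here for illustration:

```python
from itertools import product

REGIMES = ("negative", "neutral", "diversifying")
N_BRANCHES = 5             # four background branches plus one foreground branch
ALLOWED_A_PRIORI = 4       # number of profiles stated in the caption

# Every way of assigning one regime to each branch at a single site.
all_profiles = list(product(REGIMES, repeat=N_BRANCHES))
unmodeled = len(all_profiles) - ALLOWED_A_PRIORI
```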
FIG. 2.
Empirical data sets analyzed for episodic selection. Each tree is scaled on the expected number of substitutions/nucleotide. The hue of each color indicates strength of selection, with primary red corresponding to ω > 5, primary blue to ω = 0, and grey to ω = 1. The width of each color component represents the proportion of sites in the corresponding class. Thicker branches have been classified as undergoing episodic diversifying selection by the sequential test at p ≤ 0.05.

