Assessing the applicability of the GTR nucleotide substitution model through simulations

Evol Bioinform Online. 2007 Feb 4;2:145-55.

Abstract

The General Time Reversible (GTR) model of nucleotide substitution is at the core of many distance-based and character-based phylogeny inference methods. The procedure described by Waddell and Steel (1997) for estimating distances and instantaneous substitution rate matrices, R, under the GTR model is known to be inapplicable under some conditions, i.e., it leads to the inapplicability of the GTR model. Here, we simulate the evolution of DNA sequences along 12 trees characterized by different combinations of tree length, (non-)homogeneity of the substitution rate matrix R, and sequence length. We then evaluate both the frequency of the GTR model inapplicability for estimating distances and the accuracy of inferred alignments. Our results indicate that the inapplicability of Waddell and Steel's procedure can be considered a real practical issue, and illustrate that the probability of this inapplicability is a function of substitution rates and sequence length. We also discuss the implications of our results for the current implementations of maximum likelihood and Bayesian methods.

Keywords: GTR model; homogeneity; nucleotide substitution; phylogeny inference; simulations.
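
To make the notion of "inapplicability" in the abstract concrete, the following is a minimal sketch of a log-based pairwise GTR distance estimate. It assumes the commonly cited formulation d = -tr(Π log(Π⁻¹F)), where F is the symmetrized matrix of joint nucleotide frequencies between two aligned sequences and Π is the diagonal matrix of its row sums. Under that assumption, the estimate is undefined whenever Π⁻¹F has a non-positive eigenvalue, because the real matrix logarithm then does not exist. The function names (`divergence_matrix`, `gtr_distance`) are hypothetical, and this is an illustration rather than the authors' code.

```python
import numpy as np

NUCLEOTIDES = "ACGT"


def divergence_matrix(seq_a, seq_b):
    """Symmetrized 4x4 matrix of joint nucleotide frequencies
    for two aligned sequences (hypothetical helper)."""
    index = {n: i for i, n in enumerate(NUCLEOTIDES)}
    counts = np.zeros((4, 4))
    for a, b in zip(seq_a, seq_b):
        if a in index and b in index:  # skip gaps and ambiguity codes
            counts[index[a], index[b]] += 1
    total = counts.sum()
    if total == 0:
        raise ValueError("no comparable sites")
    f = counts / total
    return (f + f.T) / 2.0


def gtr_distance(f):
    """Return (distance, rt), where rt estimates the product R*t,
    or None when the estimate is undefined (assumed criterion:
    a zero base frequency or a non-positive eigenvalue)."""
    pi = f.sum(axis=1)
    if np.any(pi <= 0):
        return None
    sqrt_pi = np.sqrt(pi)
    # Pi^{-1} F is similar to the symmetric matrix
    # S = Pi^{-1/2} F Pi^{-1/2}, so its eigenvalues are real.
    s = f / np.outer(sqrt_pi, sqrt_pi)
    eigvals, eigvecs = np.linalg.eigh(s)
    if eigvals.min() <= 0:
        return None  # the real matrix logarithm does not exist
    log_s = eigvecs @ np.diag(np.log(eigvals)) @ eigvecs.T
    # log(Pi^{-1} F) = Pi^{-1/2} log(S) Pi^{1/2}
    rt = log_s * np.outer(1.0 / sqrt_pi, sqrt_pi)
    distance = -np.sum(pi * np.diag(rt))
    return distance, rt


if __name__ == "__main__":
    # Mildly diverged pair: uniform frequencies, 30% mismatches.
    f_ok = np.full((4, 4), 0.1 * 0.25)
    np.fill_diagonal(f_ok, 0.7 * 0.25)
    print(gtr_distance(f_ok))

    # Saturated pair: 90% mismatches gives negative eigenvalues.
    f_bad = np.full((4, 4), 0.3 * 0.25)
    np.fill_diagonal(f_bad, 0.1 * 0.25)
    print(gtr_distance(f_bad))  # None under the assumed criterion
```

In this sketch the symmetric similarity transform is used so that `numpy.linalg.eigh` applies and the eigenvalue check is numerically straightforward; a general-purpose matrix logarithm could be used instead. Short or highly diverged sequences make noisy F matrices, and hence failures of this kind, more likely, which is consistent with the dependence on substitution rates and sequence length reported in the abstract.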