Mol Biol Evol, 32 (1), 268-74

IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies

Lam-Tung Nguyen et al. Mol Biol Evol.

Abstract

Large phylogenomics data sets require fast tree inference methods, especially for maximum-likelihood (ML) phylogenies. Fast programs exist, but because they rely on heuristics to find optimal trees, it is not clear whether the best tree is found. Thus, there is a need for additional approaches that employ different search strategies to find ML trees and that are at the same time as fast as currently available ML programs. We show that a combination of hill-climbing approaches and a stochastic perturbation method can be implemented time-efficiently. When allowed the same CPU time as RAxML and PhyML, our software IQ-TREE found higher likelihoods in 62.2% to 87.1% of the studied alignments, thus efficiently exploring the tree space. With the IQ-TREE stopping rule, RAxML and PhyML are faster in 75.7% and 47.1% of the DNA alignments and 42.2% and 100% of the protein alignments, respectively. However, the fraction of alignments for which IQ-TREE obtains higher likelihoods then improves to 73.3-97.1%. IQ-TREE is freely available at http://www.cibiv.at/software/iqtree.

Keywords: maximum likelihood; phylogenetic inference; phylogeny; stochastic algorithm.

Figures

Fig. 1.
Performance of IQ-TREE for fixed CPU times: (a) and (b) show the frequencies of log-likelihood differences (IQ-TREE minus RAxML) for 70 DNA (a) and 45 AA (b) alignments. (c) and (d) show the same when IQ-TREE is compared with PhyML. IQ-TREE’s CPU times were limited to those required by RAxML and PhyML, respectively. The percentages on the dashed line in (b) and (d) represent the fraction of alignments where log-likelihood differences are smaller than 0.01.
Fig. 2.
Performance of IQ-TREE for variable CPU times: The upper plots (a, b) show the performance of IQ-TREE against RAxML using the 70 DNA (a) and 45 AA (b) alignments. The lower plots (c, d) show the same against PhyML. Each dot in the main diagrams represents, for one alignment, the mean difference in CPU time (y axis) and the mean difference in log-likelihood (x axis) between the trees reconstructed by the two programs compared. The whiskers at each point show the standard errors of the differences. The histograms at the top and the side present the marginal frequencies. Dots to the right of the vertical dashed line represent alignments where IQ-TREE found a higher likelihood. If a dot is below the horizontal dashed line, the reconstruction by IQ-TREE was faster. Percentages in the quadrants of the histograms denote the fraction of alignments in that region. Percentages on the dashed line give the fraction of alignments where log-likelihood differences are smaller than 0.01 (see [b] and [d]).
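The percentages reported in these figure captions can be read as a three-way tally of per-alignment log-likelihood differences. The following is a minimal sketch of such a tally, not the authors' analysis code: the function name `summarize_differences`, the treatment of the 0.01 threshold as a symmetric tie band, and the example values are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def summarize_differences(diffs: Sequence[float], tol: float = 0.01) -> Dict[str, float]:
    """Tally log-likelihood differences (program A minus program B).

    Differences with absolute value below `tol` are counted as ties
    (assumed reading of "smaller than 0.01" in the captions).
    Returns percentages of alignments in each class.
    """
    n = len(diffs)
    if n == 0:
        raise ValueError("need at least one alignment")
    higher = sum(1 for d in diffs if d >= tol)   # A found a higher likelihood
    lower = sum(1 for d in diffs if d <= -tol)   # B found a higher likelihood
    tied = n - higher - lower                    # |difference| < tol
    return {
        "higher": 100.0 * higher / n,
        "tied": 100.0 * tied / n,
        "lower": 100.0 * lower / n,
    }


# Hypothetical differences for four alignments (not data from the paper).
example = summarize_differences([0.5, 0.005, -0.2, 1.0])
print(example)
```

The three classes always sum to 100%, which makes it easy to check a tally against the percentages printed in a figure.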
Fig. 3.
Flowchart for the stochastic search algorithm. The variable count records the number of random perturbations (box b and box c) performed since a new best tree was last found.
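The general pattern described here, hill-climbing combined with random perturbation and a counter that resets whenever a new best solution is found, can be sketched on a toy problem. This is not the IQ-TREE implementation: it searches integers rather than tree topologies, and the names `hill_climb`, `stochastic_search`, `toy_score`, `toy_neighbors`, `toy_perturb` and the limit `max_unsuccessful=100` are assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Tuple


def hill_climb(x: int, score: Callable[[int], int],
               neighbors: Callable[[int], List[int]]) -> Tuple[int, int]:
    """Greedy local search: move to the best neighbor until none improves."""
    current, current_score = x, score(x)
    while True:
        best = max(neighbors(current), key=score, default=current)
        best_score = score(best)
        if best_score <= current_score:
            return current, current_score
        current, current_score = best, best_score


def stochastic_search(start: int, score, neighbors, perturb,
                      max_unsuccessful: int = 100, seed: int = 0) -> Tuple[int, int]:
    """Perturb the best solution, hill-climb, and stop after
    `max_unsuccessful` consecutive perturbations without improvement."""
    rng = random.Random(seed)
    best, best_score = hill_climb(start, score, neighbors)
    count = 0  # perturbations since the last new best solution
    while count < max_unsuccessful:
        candidate, cand_score = hill_climb(perturb(best, rng), score, neighbors)
        if cand_score > best_score:
            best, best_score = candidate, cand_score
            count = 0  # new best found: reset the counter
        else:
            count += 1
    return best, best_score


# Toy landscape on 0..99 with a local optimum at every multiple of 7
# and the global optimum at 70 (hypothetical stand-in for a likelihood).
def toy_score(x: int) -> int:
    return -abs(x - 70) + (10 if x % 7 == 0 else 0)


def toy_neighbors(x: int) -> List[int]:
    return [n for n in (x - 1, x + 1) if 0 <= n <= 99]


def toy_perturb(x: int, rng: random.Random) -> int:
    return min(99, max(0, x + rng.randint(-10, 10)))


print(hill_climb(0, toy_score, toy_neighbors))
print(stochastic_search(0, toy_score, toy_neighbors, toy_perturb))
```

On this landscape plain hill-climbing from 0 stops immediately at a local optimum, while the perturbation loop can escape it, which is the motivation for combining the two strategies.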
