FastTree 2: approximately maximum-likelihood trees for large alignments

PLoS One. 2010 Mar 10;5(3):e9490. doi: 10.1371/journal.pone.0009490.

Abstract

Background: We recently described FastTree, a tool for inferring phylogenies for alignments with up to hundreds of thousands of sequences. Here, we describe changes to FastTree that improve its accuracy without sacrificing scalability.

Methodology/principal findings: Where FastTree 1 used nearest-neighbor interchanges (NNIs) and the minimum-evolution criterion to improve the tree, FastTree 2 adds minimum-evolution subtree-pruning-regrafting moves (SPRs) and maximum-likelihood NNIs. FastTree 2 uses heuristics to restrict the search for better trees and estimates a rate of evolution for each site (the "CAT" approximation). Despite these approximations, for both simulated and genuine alignments, FastTree 2 is slightly more accurate than a standard implementation of maximum-likelihood NNIs (PhyML 3 with default settings). Although FastTree 2 is not quite as accurate as methods that use maximum-likelihood SPRs, most of the splits that disagree are poorly supported, and for large alignments, FastTree 2 is 100 to 1,000 times faster. FastTree 2 inferred a topology and likelihood-based local support values for 237,882 distinct 16S ribosomal RNAs on a desktop computer in 22 hours using 5.8 gigabytes of memory.
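
To make the idea of a minimum-evolution NNI concrete, the following is a minimal sketch and not FastTree's implementation. It assumes four leaves A, B, C and D around one internal edge and a table of pairwise distances, whereas FastTree itself works with profiles of whole subtrees. The function names (`candidate_splits`, `quartet_length`, `best_split`) and the example distances are hypothetical. For a quartet, the least-squares tree length of the split pq|rs is one quarter of the sum of all six distances plus one quarter of d(p,q) + d(r,s), so the shortest tree is the split with the smallest within-pair distance sum.

```python
from itertools import combinations


def pair_dist(dist, x, y):
    """Look up the distance between x and y (keys are unordered pairs)."""
    return dist[frozenset((x, y))]


def candidate_splits(a, b, c, d):
    """The three topologies reachable around one internal edge:
    the current split ab|cd and its two nearest-neighbor interchanges."""
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


def quartet_length(dist, split):
    """Least-squares tree length of a quartet with the given split.

    length = 1/4 * (sum of all six distances) + 1/4 * (d(p,q) + d(r,s))
    """
    (p, q), (r, s) = split
    total = sum(pair_dist(dist, x, y) for x, y in combinations((p, q, r, s), 2))
    return 0.25 * total + 0.25 * (pair_dist(dist, p, q) + pair_dist(dist, r, s))


def best_split(dist, a, b, c, d):
    """Minimum-evolution choice: keep the split with the shortest tree."""
    return min(candidate_splits(a, b, c, d), key=lambda s: quartet_length(dist, s))


# Hypothetical additive distances generated by the tree ((A,C),(B,D)) with
# leaf branches of length 1 and an internal branch of length 2.
example = {
    frozenset(("A", "C")): 2.0,
    frozenset(("B", "D")): 2.0,
    frozenset(("A", "B")): 4.0,
    frozenset(("A", "D")): 4.0,
    frozenset(("B", "C")): 4.0,
    frozenset(("C", "D")): 4.0,
}

split = best_split(example, "A", "B", "C", "D")
print(split)                           # (('A', 'C'), ('B', 'D'))
print(quartet_length(example, split))  # 6.0
```

Starting from the split AB|CD, the sketch moves to AC|BD because that rearrangement gives a tree of length 6 rather than 7. A maximum-likelihood NNI compares the same three topologies but scores each by its likelihood instead of its length.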

Conclusions/significance: FastTree 2 allows the inference of maximum-likelihood phylogenies for huge alignments. FastTree 2 is freely available at http://www.microbesonline.org/fasttree.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Computers
  • Data Interpretation, Statistical*
  • Databases, Protein
  • Genetic Techniques*
  • Humans
  • Likelihood Functions*
  • Models, Genetic
  • Phylogeny
  • RNA, Ribosomal, 16S / genetics
  • Sequence Alignment / methods*
  • Software

Substances

  • RNA, Ribosomal, 16S