PhySortR: a fast, flexible tool for sorting phylogenetic trees in R

PeerJ. 2016 May 12;4:e2038. doi: 10.7717/peerj.2038. eCollection 2016.

Abstract

A frequent bottleneck in interpreting phylogenomic output is the need to screen trees, often thousands of them, for features of interest, particularly robust clades of specific taxa, as evidence of monophyletic relationships and/or reticulate evolution. Here we present PhySortR, a fast, flexible R package for classifying phylogenetic trees. Unlike existing utilities, PhySortR identifies both exclusive and non-exclusive clades uniting the target taxa based on the tip labels (i.e., leaves) of a tree, with customisable options to assess clades within the context of the whole tree. Using simulated and empirical datasets, we demonstrate the potential and scalability of PhySortR in the analysis of thousands of phylogenetic trees without any a priori assumption about tree rooting, and in yielding readily interpretable trees that unambiguously satisfy the query. PhySortR is a freely available command-line tool that is easily automated.
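
To illustrate what sorting by exclusive and non-exclusive clades can mean when no rooting is assumed, the following is a minimal Python sketch. It is not PhySortR's code or API. It assumes that a tree is given as nested tuples of tip labels, that a "clade" is either side of a bipartition induced by an internal edge, and that two hypothetical thresholds apply: min_prop_target (the fraction of all target tips that the clade must contain) and clade_exclusivity (the fraction of the clade's tips that must be targets). The function names and default values are illustrative assumptions.

```python
from typing import Callable, FrozenSet, Set


def tips(tree) -> FrozenSet[str]:
    """Return the tip labels under a node (a label string or a tuple of subtrees)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tips(child) for child in tree))


def bipartition_sides(tree) -> Set[FrozenSet[str]]:
    """Collect both sides of every split induced by an internal node.

    Taking the complement as well makes the result independent of where
    the nested-tuple representation happens to be rooted.
    """
    all_tips = tips(tree)
    sides: Set[FrozenSet[str]] = set()

    def visit(node) -> None:
        if isinstance(node, str):
            return
        below = tips(node)
        if below != all_tips:
            sides.add(below)
            sides.add(all_tips - below)
        for child in node:
            visit(child)

    visit(tree)
    return sides


def classify(
    tree,
    is_target: Callable[[str], bool],
    min_prop_target: float = 0.7,    # hypothetical threshold
    clade_exclusivity: float = 0.9,  # hypothetical threshold
) -> str:
    """Return 'exclusive', 'non-exclusive', or 'none' for one tree."""
    targets = {t for t in tips(tree) if is_target(t)}
    if not targets:
        return "none"
    result = "none"
    for side in bipartition_sides(tree):
        if len(side) < 2:
            continue  # a single tip is not an informative clade
        in_side = side & targets
        if len(in_side) / len(targets) < min_prop_target:
            continue  # too few of the target tips are united here
        purity = len(in_side) / len(side)
        if purity == 1.0:
            return "exclusive"
        if purity >= clade_exclusivity:
            result = "non-exclusive"
    return result


def is_hs(label: str) -> bool:
    """Hypothetical target definition: tip labels starting with 'Hs'."""
    return label.startswith("Hs")


# The same unrooted topology written with two different roots.
rooted_outside = (("Hs_1", "Hs_2", "Hs_3"), ("Mm_1", ("Dr_1", "Dr_2")))
rooted_inside = ("Hs_1", ("Hs_2", "Hs_3", ("Mm_1", ("Dr_1", "Dr_2"))))
# One non-target tip nested among four target tips.
mixed = (("Hs_1", "Hs_2", "Hs_3", "Hs_4", "Mm_1"), ("Dr_1", "Dr_2", "Dr_3"))
# Target tips scattered across the tree.
scattered = (("Hs_1", "Mm_1"), ("Hs_2", "Dr_1"), ("Hs_3", "Dr_2"))
```

In this sketch both rootings of the first topology are classified identically, because each internal edge contributes both of its sides. The tree with one non-target tip among the targets counts as non-exclusive only when clade_exclusivity is relaxed below the purity of that clade (4 of 5 tips).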

Keywords: phylogenetic trees; phylogenetics; phylogenomics.

Grants and funding

This work was supported by an Australian Research Council Discovery Project grant (DP150101875) awarded to MAR, CXC and DB. TGS is supported by an Australian Postgraduate Award. CXC is supported by a Great Barrier Reef Foundation Bioinformatics Fellowship awarded to MAR. DB acknowledges support from the National Science Foundation (1004213). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.