TreeSwift: A massively scalable Python tree package
- PMID: 35903557
- PMCID: PMC9328415
- DOI: 10.1016/j.softx.2020.100436
Abstract
Phylogenetic trees are essential to evolutionary biology, and numerous methods exist that attempt to extract phylogenetic information applicable to a wide range of disciplines, such as epidemiology and metagenomics. Currently, the three main Python packages for trees are Bio.Phylo, DendroPy, and the ETE Toolkit, but as dataset sizes grow, parsing and manipulating ultra-large trees becomes impractical for these tools. To address this issue, we present TreeSwift, a user-friendly and massively scalable Python package for traversing and manipulating trees that is ideal for algorithms performed on ultra-large trees.
Keywords: Phylogenetics; Python; Scalable; Tree traversal.
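To illustrate why traversal strategy matters on ultra-large trees, here is a minimal sketch of an iterative postorder traversal in plain Python. It is not TreeSwift's implementation or API: the `Node` class and the helper names `traverse_postorder`, `count_leaves`, and `build_caterpillar` are hypothetical, chosen only for this example. The sketch assumes a simple rooted tree in which each node stores a list of children, and it uses explicit stacks instead of recursion so that very deep trees do not hit Python's recursion limit.

```python
class Node:
    """Minimal tree node (hypothetical; not TreeSwift's Node class)."""

    def __init__(self, label=None):
        self.label = label
        self.children = []

    def add_child(self, child):
        """Attach a child node and return it for convenient chaining."""
        self.children.append(child)
        return child


def traverse_postorder(root):
    """Yield every node after all of its descendants, without recursion."""
    stack = [root]
    visited = []
    while stack:
        node = stack.pop()
        visited.append(node)
        stack.extend(node.children)
    # `visited` is in "node, right subtree, left subtree" order;
    # popping from its end reverses that into a postorder.
    while visited:
        yield visited.pop()


def count_leaves(root):
    """Count nodes that have no children."""
    return sum(1 for node in traverse_postorder(root) if not node.children)


def build_caterpillar(n_leaves):
    """Build a maximally unbalanced tree with n_leaves leaves (assumes n_leaves >= 1)."""
    root = Node("root")
    current = root
    for i in range(n_leaves - 1):
        current.add_child(Node(f"leaf{i}"))
        current = current.add_child(Node())
    # The last node on the spine has no children, so it is itself a leaf.
    current.label = f"leaf{n_leaves - 1}"
    return root
```

For example, `count_leaves(build_caterpillar(20000))` traverses a tree whose depth far exceeds Python's default recursion limit of 1000, a case where a naive recursive traversal would raise `RecursionError`.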