Current Methods for Automated Filtering of Multiple Sequence Alignments Frequently Worsen Single-Gene Phylogenetic Inference
- PMID: 26031838
- PMCID: PMC4538881
- DOI: 10.1093/sysbio/syv033
Abstract
Phylogenetic inference is generally performed on the basis of multiple sequence alignments (MSAs). Because errors in an alignment can lead to errors in tree estimation, there is a strong interest in identifying and removing unreliable parts of the alignment. In recent years several automated filtering approaches have been proposed, but despite their popularity, a systematic and comprehensive comparison of different alignment filtering methods on real data has been lacking. Here, we extend and apply recently introduced phylogenetic tests of alignment accuracy on a large number of gene families and contrast the performance of unfiltered versus filtered alignments in the context of single-gene phylogeny reconstruction. Based on multiple genome-wide empirical and simulated data sets, we show that the trees obtained from filtered MSAs are on average worse than those obtained from unfiltered MSAs. Furthermore, alignment filtering often leads to an increase in the proportion of well-supported branches that are actually wrong. We confirm that our findings hold for a wide range of parameters and methods. Although our results suggest that light filtering (up to 20% of alignment positions) has little impact on tree accuracy and may save some computation time, contrary to widespread practice, we do not generally recommend the use of current alignment filtering methods for phylogenetic inference. By providing a way to rigorously and systematically measure the impact of filtering on alignments, the methodology set forth here will guide the development of better filtering algorithms.
Keywords: alignment filtering; alignment trimming; molecular phylogeny; multiple sequence alignment; phylogenetic inference; phylogenetics; phylogeny.
© The Author(s) 2015. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
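To make the notion of "filtering alignment positions" concrete, the following is a minimal sketch of a generic gap-threshold column filter, together with the fraction of positions it removes (the quantity the abstract refers to when it speaks of light filtering, up to 20% of positions). This is not one of the filtering methods evaluated in the study; the function name `filter_gappy_columns`, the `max_gap_fraction` parameter, and the toy alignment are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def filter_gappy_columns(
    alignment: List[str], max_gap_fraction: float = 0.5
) -> Tuple[List[str], float]:
    """Drop columns whose gap fraction exceeds max_gap_fraction.

    Hypothetical helper for illustration only. Returns the filtered
    alignment and the fraction of columns that were removed.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    n_rows = len(alignment)
    n_cols = len(alignment[0])
    if n_cols == 0:
        raise ValueError("alignment has no columns")
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all rows must have the same length")

    # Keep a column if at most max_gap_fraction of its characters are gaps.
    keep = [
        col
        for col in range(n_cols)
        if sum(row[col] == "-" for row in alignment) / n_rows <= max_gap_fraction
    ]
    filtered = ["".join(row[col] for col in keep) for row in alignment]
    removed_fraction = (n_cols - len(keep)) / n_cols
    return filtered, removed_fraction


# Toy alignment (made up): 4 sequences, 10 columns.
toy = [
    "ATG-CGTA-C",
    "ATG-CGTAAC",
    "ATGACG-A-C",
    "ATG-CGTA-C",
]
filtered, removed = filter_gappy_columns(toy, max_gap_fraction=0.5)
print(filtered)
print(f"fraction of positions removed: {removed:.2f}")
```

In this toy case two of the ten columns are mostly gaps and are dropped, so the filter sits at the upper end of the light-filtering range mentioned in the abstract. Whether such a filter helps or hurts tree inference is exactly what the study measures; the sketch only shows how the amount of filtering can be quantified.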
