MSARC: Multiple sequence alignment by residue clustering

Algorithms Mol Biol. 2014 Apr 16;9:12. doi: 10.1186/1748-7188-9-12. eCollection 2014.

Abstract

Background: Progressive methods offer efficient and reasonably good solutions to the multiple sequence alignment problem. However, the resulting alignments are biased by the guide trees, especially for relatively distant sequences.

Results: We propose MSARC, a new graph-clustering-based algorithm that aligns sequence sets without guide trees. Experiments on the BAliBASE dataset show that MSARC achieves alignment quality similar to that of the best progressive methods. Furthermore, MSARC outperforms them on sequence sets whose evolutionary distances are difficult to represent by a phylogenetic tree. Such datasets are the most exposed to guide-tree bias.
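
To make the idea of residue clustering concrete, the following is a minimal toy sketch and not MSARC's actual procedure, which the abstract does not detail. It assumes that residues are graph nodes labelled (sequence index, position), that hypothetical pairwise affinity weights are given as edges, and that residues joined by edges at or above a chosen threshold fall into the same alignment column. The function name, the threshold and the example weights are all illustrative assumptions. Unlike a real aligner, the sketch does not check that a cluster holds at most one residue per sequence or that the columns can be consistently ordered.

```python
from collections import defaultdict


def cluster_residues(edges, threshold):
    """Group residues linked by edges whose weight is >= threshold.

    edges: iterable of (residue_a, residue_b, weight), where a residue is a
    (sequence_index, position) tuple. Returns a sorted list of sorted clusters.
    Illustrative only: simple connected components via union-find.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, weight in edges:
        root_a, root_b = find(a), find(b)  # registers both residues
        if weight >= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(list)
    for node in list(parent):
        clusters[find(node)].append(node)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical affinities between residues of three short sequences.
example_edges = [
    ((0, 0), (1, 0), 0.9),
    ((1, 0), (2, 1), 0.8),
    ((0, 1), (1, 1), 0.7),
    ((0, 1), (2, 0), 0.2),  # too weak to merge at threshold 0.5
]
columns = cluster_residues(example_edges, threshold=0.5)
```

Each resulting cluster plays the role of one candidate column. No guide tree is involved, since the grouping depends only on the pairwise weights.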

Availability: MSARC is available at http://bioputer.mimuw.edu.pl/msarc.

Keywords: Graph partitioning; Multiple sequence alignment; Stochastic alignment.