Multiple structural alignment and clustering of RNA sequences

Elfar Torarinsson; Jakob H Havgaard; Jan Gorodkin

doi:10.1093/bioinformatics/btm049

Bioinformatics. 2007 Apr 15;23(8):926-32. doi: 10.1093/bioinformatics/btm049. Epub 2007 Feb 25.

Authors

Elfar Torarinsson¹, Jakob H Havgaard, Jan Gorodkin

Affiliation

¹ Division of Genetics and Bioinformatics, IBHV and Center for Bioinformatics, University of Copenhagen, Frederiksberg C, Denmark.

PMID: 17324941

Abstract

Motivation: An apparent paradox in computational RNA structure prediction is that many methods that search for a structure common to a set of related sequences require a multiple alignment of those sequences in advance. However, such an alignment is hard to obtain, even for a few sequences, when sequence similarity is low, unless the sequences are folded and aligned simultaneously. Furthermore, it is of interest to construct a multiple alignment of RNA sequence candidates found by searching as few as two genomic sequences.

Results: Here, based on the PMcomp program, we present a global multiple alignment program, foldalignM, which performs especially well on a few sequences with low sequence similarity and is, in general, comparable in performance with state-of-the-art programs. In addition, it can cluster sequences based on sequence and structure similarity and output a multiple alignment for each cluster. Furthermore, preliminary results with local datasets indicate that the program is useful for post-processing foldalign pairwise scans.

Availability: The program foldalignM is implemented in Java and is available, along with some accompanying Perl scripts, at http://foldalign.ku.dk/

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Base Sequence
Cluster Analysis*
Molecular Sequence Data
RNA / chemistry*
RNA / genetics*
Sequence Alignment / methods*
Sequence Analysis, RNA / methods*
Sequence Homology, Nucleic Acid
Software*

Substances

RNA