Multiple alignment by sequence annealing

Ariel S Schwartz; Lior Pachter

doi:10.1093/bioinformatics/btl311

Bioinformatics. 2007 Jan 15;23(2):e24-9. doi: 10.1093/bioinformatics/btl311.

Authors

Ariel S Schwartz¹, Lior Pachter

Affiliation

¹ EECS, Computer Science Division, University of California Berkeley, CA 94720, USA. sariel@cs.berkeley.edu

PMID: 17237099
DOI: 10.1093/bioinformatics/btl311

Abstract

Motivation: We introduce a novel approach to multiple alignment that is based on an algorithm for rapidly checking whether single matches are consistent with a partial multiple alignment. This leads to a sequence annealing algorithm, which is an incremental method for building multiple sequence alignments one match at a time. Our approach improves significantly on the standard progressive alignment approach to multiple alignment.
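The idea of adding one match at a time, subject to a consistency check, can be illustrated with a small sketch. The abstract does not specify the data structures or the scoring, so everything below is an assumption made for illustration: residues are `(sequence, position)` pairs, candidate matches arrive with a precomputed score, columns are tracked with a union-find table, and a match counts as consistent if no column ends up holding two residues of the same sequence and the columns can still be ordered left to right. The names `find`, `is_consistent`, `anneal`, and `threshold` are hypothetical. The check here naively rescans the whole alignment on every candidate, so it is not the rapid check the paper refers to.

```python
from collections import defaultdict, deque


def find(parent, x):
    """Return the representative column of residue x, compressing the path."""
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def is_consistent(parent, lengths):
    """Return True if the columns in `parent` form a valid partial alignment."""
    members = defaultdict(set)  # column -> ids of sequences present in it
    for s, p in list(parent):
        col = find(parent, (s, p))
        if s in members[col]:
            return False  # two residues of one sequence share a column
        members[col].add(s)

    # Edge col(s, p) -> col(s, p + 1): columns must respect residue order.
    succ = defaultdict(set)
    indeg = {col: 0 for col in members}
    for s, n in enumerate(lengths):
        for p in range(n - 1):
            u, v = find(parent, (s, p)), find(parent, (s, p + 1))
            if v not in succ[u]:
                succ[u].add(v)
                indeg[v] += 1

    # Kahn's algorithm: the columns can be ordered iff the graph is acyclic.
    queue = deque(col for col, d in indeg.items() if d == 0)
    ordered = 0
    while queue:
        u = queue.popleft()
        ordered += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return ordered == len(indeg)


def anneal(lengths, scored_matches, threshold=0.0):
    """Greedily accept matches, best score first, while they stay consistent.

    `scored_matches` holds (score, residue_a, residue_b) tuples; `threshold`
    is a hypothetical cutoff below which matches are not considered.
    """
    parent = {(s, p): (s, p) for s, n in enumerate(lengths) for p in range(n)}
    accepted = []
    for score, a, b in sorted(scored_matches, reverse=True):
        if score < threshold:
            break
        ra, rb = find(parent, a), find(parent, b)
        if ra == rb:
            continue  # already in the same column through earlier matches
        trial = dict(parent)  # tentative merge, discarded if inconsistent
        trial[ra] = rb
        if is_consistent(trial, lengths):
            parent = trial
            accepted.append((a, b))
    return parent, accepted
```

With two sequences of length 3 and candidate matches `(0.9, (0, 0), (1, 1))`, `(0.8, (0, 1), (1, 0))`, and `(0.7, (0, 2), (1, 2))`, the second match crosses the first and is rejected, while the first and third are accepted. Raising `threshold` in this sketch accepts fewer matches, which is one simple way an adjustable cutoff could trade sensitivity for specificity.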

Results: The sequence annealing algorithm performs well on benchmark test sets of protein sequences. It is not only sensitive, but also specific, drastically reducing the number of incorrectly aligned residues in comparison to other programs. The method allows for adjustment of the sensitivity/specificity tradeoff and can be used to reliably identify homologous regions among protein sequences.

Availability: An implementation of the sequence annealing algorithm is available at http://bio.math.berkeley.edu/amap/

Publication types

Evaluation Study
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms
Amino Acid Sequence
Molecular Sequence Data
Proteins / chemistry*
Sequence Alignment / methods*
Sequence Analysis, Protein / methods*

Substances

Proteins

Grants and funding

R01HG2362/HG/NHGRI NIH HHS/United States