Local multiple sequence alignment using dead-end elimination

Bioinformatics. 1999 Nov;15(11):947-53. doi: 10.1093/bioinformatics/15.11.947.

Abstract

Motivation: Local multiple sequence alignment is a basic tool for extracting functionally important regions shared by a family of protein sequences. We present an effectively polynomial-time algorithm for rigorously solving the local multiple alignment problem.

Results: The algorithm is based on the dead-end elimination procedure that makes it possible to avoid an exhaustive search. In the framework of the sum-of-pairs scoring system, certain rejection criteria are derived in order to eliminate those sequence segments and segment pairs that can be mathematically shown to be inconsistent (dead-ending) with the globally optimal alignment. Iterative application of the elimination criteria results in a rapid reduction of combinatorial possibilities without considering them explicitly. In the vast majority of cases, the procedure converges to a unique globally optimal solution. In contrast to the exhaustive search, whose computational complexity is combinatorial, the algorithm is computationally feasible because the number of operations required to eliminate the dead-ending segments and segment pairs grows quadratically and cubically, respectively, with the total number of sequence elements. The method is illustrated on a set of protein families for which the globally optimal alignments are well recognized.
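The elimination idea described above can be illustrated with a minimal sketch. It is not the authors' program, and the abstract does not state their rejection criteria. The sketch assumes fixed-width ungapped segments, a toy match/mismatch score, and the classic single-segment dead-end elimination test adapted to score maximization: a segment is rejected when its best possible sum-of-pairs contribution is lower than the worst possible contribution of a competing segment from the same sequence. The names segment_score, sum_of_pairs, exhaustive_best and dee_singles are hypothetical, and the segment-pair criterion is not shown.

```python
from itertools import product


def segment_score(x, y):
    """Toy ungapped score (an assumption): +1 per match, -1 per mismatch."""
    return sum(1 if p == q else -1 for p, q in zip(x, y))


def sum_of_pairs(seqs, starts, w):
    """Sum-of-pairs score of one segment of width w chosen per sequence."""
    segs = [s[a:a + w] for s, a in zip(seqs, starts)]
    return sum(segment_score(segs[i], segs[j])
               for i in range(len(segs))
               for j in range(i + 1, len(segs)))


def exhaustive_best(seqs, w, alive=None):
    """Best score by brute force over all (or only surviving) start positions."""
    if alive is None:
        alive = [range(len(s) - w + 1) for s in seqs]
    return max(sum_of_pairs(seqs, starts, w) for starts in product(*alive))


def dee_singles(seqs, w):
    """Iteratively discard start positions that cannot be in any optimum."""
    # alive[i] holds the start positions in sequence i not yet eliminated.
    alive = [set(range(len(s) - w + 1)) for s in seqs]
    changed = True
    while changed:
        changed = False
        for i in range(len(seqs)):
            best, worst = {}, {}
            for a in alive[i]:
                seg_a = seqs[i][a:a + w]
                hi = lo = 0
                for j in range(len(seqs)):
                    if j == i:
                        continue
                    scores = [segment_score(seg_a, seqs[j][b:b + w])
                              for b in alive[j]]
                    hi += max(scores)  # most favourable partner in sequence j
                    lo += min(scores)  # least favourable partner in sequence j
                best[a], worst[a] = hi, lo
            # A segment is dead-ending if even its best case is strictly
            # below the worst case of some competitor in the same sequence.
            floor = max(worst.values())
            dead = {a for a in alive[i] if best[a] < floor}
            if dead:
                alive[i] -= dead
                changed = True
    return alive
```

Because the test uses a strict inequality, a segment that belongs to an optimal alignment is never removed, so exhaustive_best(seqs, w, dee_singles(seqs, w)) returns the same score as exhaustive_best(seqs, w) while enumerating fewer combinations whenever anything was eliminated. This simple criterion is conservative and may leave several candidates per sequence; the segment-pair criteria mentioned in the abstract would be needed for stronger pruning.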

Availability: The source code of the program implementing the algorithm is available upon request from the authors.

Contact: alex_lukashin@biogen.com.

MeSH terms

  • Algorithms*
  • Computational Biology / methods
  • Evaluation Studies as Topic
  • Predictive Value of Tests
  • Reproducibility of Results
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Time Factors