MATCH-BOX: A Fundamentally New Algorithm for the Simultaneous Alignment of Several Protein Sequences

Comput Appl Biosci. 1992 Oct;8(5):501-9. doi: 10.1093/bioinformatics/8.5.501.


Original algorithms for the simultaneous alignment of protein sequences are presented, including sequence clustering and within-group or between-group multiple alignment. The way of matching similar regions is fundamentally new. Complete matches are formed by segments that are more similar than expected by chance, according to a given probability limit. Any classic or user-defined score matrix can be used to express the similarity between residues. The algorithm searches for complete matches common to all the sequences without performing pairwise alignment and regardless of gap weighting. An automatic screening delineates all the similar regions (boxes) that can be defined for a given maximal shift between the sequences. The shift can be large enough to allow any region of one sequence to be matched with any region of another. It can also be short, and used to refine the alignment around anchor points. The algorithm provides the most likely optimal alignment and a comprehensive list of the alignment dilemmas. Both automatic and interactive operation are supported: depending on the complexity of the problem, a final alignment is either obtained fully automatically or requires some interactive handling to discriminate between alternative pathways.
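
The idea of a complete match within a maximal shift can be illustrated with a minimal Python sketch. It is not the published implementation, and several parts are assumptions made for illustration only: the function names (`identity_score`, `segment_score`, `find_complete_matches`), the fixed segment `width`, the toy identity score used in place of a real score matrix, and the raw-score `threshold` that stands in for the probability limit, whose actual statistic the abstract does not give. The shift is measured here relative to the first sequence, which is also an assumption. Segments are compared directly and ungapped, so no pairwise alignment or gap weighting is involved.

```python
from itertools import combinations, product


def identity_score(a, b):
    """Toy residue score (assumption): 1 for identical residues, else 0."""
    return 1 if a == b else 0


def segment_score(seg_a, seg_b, score=identity_score):
    """Sum of residue scores over two equal-length, ungapped segments."""
    return sum(score(a, b) for a, b in zip(seg_a, seg_b))


def find_complete_matches(seqs, width, max_shift, threshold,
                          score=identity_score):
    """Return tuples of start positions, one per sequence, whose segments
    all score at least `threshold` against each other.

    `threshold` is a raw score standing in for the probability limit
    (assumption); `max_shift` is measured relative to the first sequence.
    """
    first, others = seqs[0], seqs[1:]
    matches = []
    for i in range(len(first) - width + 1):
        # Candidate starts in every other sequence, within max_shift of i.
        candidate_starts = [
            range(max(0, i - max_shift),
                  min(len(s) - width, i + max_shift) + 1)
            for s in others
        ]
        for starts in product(*candidate_starts):
            positions = (i, *starts)
            segments = [s[p:p + width] for s, p in zip(seqs, positions)]
            # A complete match requires every pair of segments to pass.
            if all(segment_score(a, b, score) >= threshold
                   for a, b in combinations(segments, 2)):
                matches.append(positions)
    return matches


seqs = ["MKTAYIAK", "GGMKTAYI", "KTAYIAKQ"]
print(find_complete_matches(seqs, width=5, max_shift=2, threshold=5))
# prints [(1, 3, 0)]
```

With a maximal shift of 2, the segment KTAYI is found in all three sequences; with a maximal shift of 1 the same call returns no match, because the second sequence is offset by two positions. This exhaustive search grows quickly with the number of sequences and the shift, so it should be read only as a statement of the matching criterion, not as a practical screening procedure.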

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Cluster Analysis
  • Sequence Alignment*
  • Software Design