[Information about the protein secondary structure improves quality of an alignment of protein sequences]

Mol Biol (Mosk). 2006 May-Jun;40(3):533-40.
[Article in Russian]

Abstract

All popular algorithms for pairwise alignment of protein primary structures (e.g. Smith-Waterman (SW), FASTA, BLAST) use only the amino acid sequences. The SW algorithm is the most accurate of them, i.e. it produces the alignments most similar to those obtained by superposition of protein 3D structures. But even the SW algorithm is unable to restore the 3D-based alignment if the amino acid sequence identity (%id) is below 30%. We propose a novel alignment method that explicitly takes into account the secondary structure of the compared proteins, and show that it produces significantly more accurate alignments than the SW algorithm. In particular, for sequences with %id < 30% the average accuracy of the new method is 58%, compared to 35% for the SW algorithm (the accuracy of an algorithmic sequence alignment is the fraction of positions of a "gold standard" alignment, obtained by superposition of the corresponding 3D structures, that it restores). The accuracy of the proposed method is approximately the same for experimentally determined and for theoretically predicted secondary structures. Thus the method can be applied to align protein sequences even if the protein 3D structure is unknown. The program is available at ftp://194.149.64.196/STRUSWER/.
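
The accuracy measure defined in the abstract can be illustrated with a minimal sketch. It is not taken from the STRUSWER program. It assumes that each alignment is given as two gapped rows of equal length with "-" as the gap character, and that a gold-standard position counts as restored when the same pair of residue indices is aligned in the algorithmic alignment. The function names aligned_pairs and alignment_accuracy are hypothetical.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (i, j) residue index pairs aligned to each other.

    row_a and row_b are the two gapped rows of one pairwise alignment
    (an assumed representation; indices are 0-based positions in the
    ungapped sequences).
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0  # positions in the ungapped sequences
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            pairs.add((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def alignment_accuracy(test_rows, gold_rows):
    """Fraction of gold-standard aligned pairs that the test alignment restores."""
    gold = aligned_pairs(*gold_rows)
    if not gold:
        raise ValueError("gold-standard alignment has no aligned positions")
    test = aligned_pairs(*test_rows)
    return len(test & gold) / len(gold)


# Toy example: the gold standard aligns all six residues; the test
# alignment misplaces one residue pair by opening two gaps.
gold = ("ACDEFG", "ACDEFG")
test = ("ACD-EFG", "ACDE-FG")
print(round(alignment_accuracy(test, gold), 3))  # 0.833
```

In this toy case five of the six gold-standard pairs are restored, so the accuracy is 5/6. Averaging this value over a set of protein pairs would give figures comparable in kind to the 58% and 35% quoted above, although the exact counting convention used by the authors is not stated in the abstract.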

Publication types

  • Comparative Study
  • English Abstract

MeSH terms

  • Algorithms*
  • Animals
  • Humans
  • Internet
  • Predictive Value of Tests
  • Protein Structure, Secondary*
  • Sequence Alignment*
  • Sequence Analysis, Protein*
  • Software*