A symmetric-iterated multiple alignment of protein sequences

J Mol Biol. 1998 Feb 13;276(1):249-64. doi: 10.1006/jmbi.1997.1527.

Abstract

A new symmetric-iterative method for multiple alignment of protein sequences is presented. The method can be described as a combination of motif finding and dynamic programming procedures. It uses each sequence as a standard to which all sequences are aligned based on the significant segment pair alignment (SSPA) protocol. Sequences are further matched using a reduced scoring threshold to provide fillers and extensions between highly significant segment pair matches. The method produces alignment blocks that accommodate indels and are separated by variable-length unaligned segments. Construction of consensus sequences is iterative, assigning greater weights to more distantly related sequences. A consensus sequence and various measures of conservation at each aligned position can be used for comparisons between protein families, for database searches, and for analysis of functional and evolutionary features. The method is illustrated on the extended family of prokaryotic and eukaryotic RecA-like sequences. The RecA-like sequences reveal extended alignments among eubacterial RecA and separately among eukaryotic/archaebacterial Rad51/RadA. Eleven conserved blocks are common to both groups, two of them encompassing the ATP-binding A- and B-sites. Among the most conserved positions are glycine residues. For example, they occur twice as doublets, putatively serving as hinge connections that provide opportunity for alternative structural conformations. Several charged/polar residues are also highly conserved, probably as a consequence of the extensive intermonomer interactions in RecA/Rad51 filament formation and of possibly relevant protein-protein and protein-nucleic acid interactions.
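The abstract mentions a weighted consensus sequence and per-position conservation measures but does not give the formulas. The following is a minimal sketch of one way such quantities could be computed for a single gap-free alignment block. It is not the authors' procedure: the function name `weighted_consensus`, the per-sequence `weights`, and the conservation measure (the weighted fraction of rows that agree with the consensus residue) are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_consensus(block, weights):
    """Return (consensus, conservation) for one aligned block.

    block   -- list of equal-length aligned strings (hypothetical input)
    weights -- one non-negative weight per row; under the assumption used
               here, more distantly related sequences would get larger weights
    """
    if not block or len(block) != len(weights):
        raise ValueError("need one weight per aligned row")
    length = len(block[0])
    if any(len(row) != length for row in block):
        raise ValueError("all rows in a block must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    consensus = []
    conservation = []
    for col in range(length):
        # Accumulate the weight behind each symbol seen in this column.
        scores = defaultdict(float)
        for row, weight in zip(block, weights):
            scores[row[col]] += weight
        # Highest weighted symbol wins; ties go to the alphabetically first.
        residue, best = max(sorted(scores.items()), key=lambda kv: kv[1])
        consensus.append(residue)
        # Assumed conservation measure: weighted share of the winning symbol.
        conservation.append(best / total)
    return "".join(consensus), conservation


if __name__ == "__main__":
    # Toy block; the sequences and weights are invented for illustration.
    rows = ["GGAT", "GGSA", "GGAV"]
    cons, scores = weighted_consensus(rows, [1.0, 1.5, 1.0])
    print(cons, [round(s, 2) for s in scores])
```

In this toy block the two leading glycine columns are fully conserved, while the last two columns are decided by the weights. An iterative scheme like the one the abstract describes would presumably recompute the weights from each new consensus and repeat, but that loop is omitted here because the abstract does not specify it.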

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Consensus Sequence
  • Eukaryotic Cells / chemistry
  • Evaluation Studies as Topic
  • Molecular Sequence Data
  • Multigene Family
  • Prokaryotic Cells / chemistry
  • Rec A Recombinases / chemistry
  • Sequence Alignment / methods*
  • Sequence Homology, Amino Acid
  • Software

Substances

  • Rec A Recombinases