COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance

J Mol Biol. 2003 Feb 7;326(1):317-36. doi: 10.1016/s0022-2836(02)01371-2.

Abstract

We present a novel method for the comparison of multiple protein alignments with assessment of statistical significance (COMPASS). The method derives numerical profiles from alignments, constructs optimal local profile-profile alignments, and analytically estimates E-values for the detected similarities. The scoring system and E-value calculation generalize the PSI-BLAST approach to profile-sequence comparison, adapting it to the profile-profile case. Tested alongside existing methods for profile-sequence (PSI-BLAST) and profile-profile (prof_sim) comparison, COMPASS detects remote sequence similarities with greater sensitivity and selectivity and produces local alignments of improved quality. The method allows prediction of relationships between protein families in the PFAM database beyond the range of conventional methods. Two predicted relations with high significance are similarities between various Rossmann-type folds and between various helix-turn-helix-containing families. The potential value of COMPASS for structure/function predictions is illustrated by the detection of an intricate homology between the DNA-binding domain of the CTF/NFI family and the MH1 domain of the Smad family.
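
The pipeline summarized in the abstract (score pairs of profile columns, find the best local alignment, convert its score to an E-value) can be illustrated with a minimal Python sketch. This is not the COMPASS implementation. It assumes a precomputed column-versus-column score matrix, substitutes a plain Smith-Waterman recursion with a linear gap penalty, and uses the standard Karlin-Altschul form E = K * m * n * exp(-lambda * S). The function names, the gap penalty, and the values of `lam` and `k` are hypothetical placeholders, not parameters from the paper.

```python
import math


def best_local_score(scores, gap=1.0):
    """Best local alignment score over a column-vs-column score matrix.

    scores[i][j] is an assumed, precomputed score for aligning column i of
    one profile with column j of the other. Linear gap penalty (assumption).
    """
    cols = len(scores[0])
    prev = [0.0] * (cols + 1)
    best = 0.0
    for row in scores:
        cur = [0.0] * (cols + 1)
        for j in range(1, cols + 1):
            cur[j] = max(
                0.0,                     # start a new local alignment
                prev[j - 1] + row[j - 1],  # align the two columns
                prev[j] - gap,           # gap in the second profile
                cur[j - 1] - gap,        # gap in the first profile
            )
            best = max(best, cur[j])
        prev = cur
    return best


def e_value(score, m, n, lam, k):
    """Karlin-Altschul style E-value for profiles of lengths m and n.

    lam and k are placeholder statistical parameters (assumption).
    """
    return k * m * n * math.exp(-lam * score)


# Toy 3x3 score matrix with a strong diagonal (illustrative values only).
toy_scores = [
    [2.0, -1.0, -1.0],
    [-1.0, 3.0, -1.0],
    [-1.0, -1.0, 1.0],
]
toy_best = best_local_score(toy_scores)
toy_e = e_value(toy_best, m=3, n=3, lam=0.3, k=0.1)
```

In this toy case the best local alignment follows the diagonal, and a higher alignment score always yields a smaller E-value for fixed profile lengths and parameters.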

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Computational Biology / methods*
  • Models, Molecular
  • Molecular Sequence Data
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Sensitivity and Specificity
  • Sequence Alignment / methods*
  • Sequence Homology, Amino Acid
  • Software*
  • Statistics as Topic

Substances

  • Proteins