Sequence patterns derived from the automated prediction of functional residues in structurally-aligned homologous protein families

Bioinformatics. 2004 Oct 12;20(15):2380-9. doi: 10.1093/bioinformatics/bth255. Epub 2004 Apr 8.

Abstract

Motivation: Most proteins have evolved to perform specific functions that depend on the adoption of well-defined three-dimensional (3D) structures. Specific patterns of conserved residues are frequently observed in the amino acid sequences of divergently evolved proteins; these may reflect evolutionary restraints arising both from the need to maintain tertiary structure and from the requirement to conserve residues more directly involved in function. Databases of such sequence patterns are valuable in identifying distant homologues, in predicting function and in the study of evolution.

Results: A fully automated database of protein sequence patterns, the Functional Protein Sequence Pattern Database (FPSPD), has been derived from the analysis of conserved residues that are predicted to be functional in the structurally aligned homologous families of the HOMSTRAD database. Environment-dependent substitution tables, evolutionary trace analysis, solvent accessibility calculations and 3D structures were used to obtain the FPSPD. The method yielded 3584 patterns that are considered functional and 3049 patterns that are probably functional. FPSPD could be useful for assigning a protein to a homologous superfamily and thereby providing clues about its function.
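As a rough illustration of what a sequence pattern derived from an aligned family can look like, the following Python sketch builds a PROSITE-style pattern from the fully conserved columns of a small alignment and scans a query sequence with it. It is not the FPSPD procedure: the abstract does not specify the pattern syntax or the selection rules, so the conservation threshold, the optional `functional_columns` argument (a stand-in for upstream functional-residue predictions), the function names and the toy alignment are all assumptions made for this example. The sketch also assumes a gap-free alignment, so that column spacing equals sequence spacing.

```python
import re
from collections import Counter


def conserved_columns(alignment, threshold=1.0, functional_columns=None):
    """Map column index -> residue for columns whose most common residue
    reaches `threshold`. `functional_columns` is a hypothetical set of
    column indices standing in for predicted functional positions."""
    n = len(alignment)
    conserved = {}
    for i, column in enumerate(zip(*alignment)):
        if functional_columns is not None and i not in functional_columns:
            continue
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / n >= threshold:
            conserved[i] = residue
    return conserved


def build_pattern(conserved):
    """Join conserved residues into a PROSITE-style pattern, writing the
    unconserved positions between them as x or x(n)."""
    if not conserved:
        return ""
    cols = sorted(conserved)
    parts = [conserved[cols[0]]]
    for prev, cur in zip(cols, cols[1:]):
        gap = cur - prev - 1
        if gap == 1:
            parts.append("x")
        elif gap > 1:
            parts.append(f"x({gap})")
        parts.append(conserved[cur])
    return "-".join(parts)


def pattern_to_regex(pattern):
    """Translate the restricted pattern syntax used above into a regex."""
    pieces = []
    for element in pattern.split("-"):
        repeat = re.fullmatch(r"x\((\d+)\)", element)
        if repeat:
            pieces.append(".{%s}" % repeat.group(1))
        elif element == "x":
            pieces.append(".")
        else:
            pieces.append(element)
    return "".join(pieces)


# Toy gap-free alignment (invented for illustration, not HOMSTRAD data).
alignment = ["CAEHKLGT", "CSDHRMGT", "CTEHKIGS"]
pattern = build_pattern(conserved_columns(alignment))
regex = pattern_to_regex(pattern)
hit = re.search(regex, "MKCQWHAAGLP")
```

Restricting the pattern to a subset of columns, for example `conserved_columns(alignment, functional_columns={0, 3})`, mimics the idea of keeping only conserved positions that an upstream analysis has flagged as functional. A real pipeline would also need to handle alignment gaps and allow sets of alternative residues at a position.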

Availability: FPSPD is available at http://www-cryst.bioc.cam.ac.uk/~fpspd/

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Amino Acids / chemistry
  • Conserved Sequence
  • Databases, Protein*
  • Models, Molecular
  • Pattern Recognition, Automated / methods*
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / classification*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Structure-Activity Relationship

Substances

  • Amino Acids
  • Proteins