Thirty-plus functional families from a single motif

Protein Sci. 2000 Dec;9(12):2470-6. doi: 10.1110/ps.9.12.2470.

Abstract

It is now possible to identify over 30 functional subfamilies among the WD-repeat-containing proteins found in the completed genomes. The majority of these subfamilies have at least one member for which experimental data allow assignment to a cellular pathway or process. Half of the 63 WD-repeat-containing proteins in Saccharomyces cerevisiae, half of the 70 in Caenorhabditis elegans, and a third of the 100-plus predicted in Drosophila can be assigned to 23 of these functional subfamilies. Perhaps indicative of the future, 33 WD-repeat-containing proteins from the partial genome of Arabidopsis thaliana can now be assigned to 18 of these subfamilies. These assignments have been made possible by combining traditional sequence similarity with an implied common beta-propeller structural context to obtain measures of protein-protein surface similarity. The beta-propeller structural context is represented in the form of a Hidden Markov Model. The procedure is completely automated.
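As a rough illustration of the idea of restricting sequence similarity to a structural context, the following minimal sketch compares two aligned fragments over all positions and then over surface positions only. It is not the authors' procedure: the sequences, the per-position surface/core labels (standing in for what a structural model might assign), and the function name `identity` are all hypothetical, and plain percent identity is used in place of whatever similarity measure the paper employs.

```python
def identity(a, b, positions=None):
    """Fraction of identical residues between two aligned sequences.

    If `positions` is given, only those indices are compared.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    idx = list(range(len(a)) if positions is None else positions)
    if not idx:
        return 0.0
    matches = sum(1 for i in idx if a[i] == b[i])
    return matches / len(idx)


# Toy aligned fragments, invented for illustration (not real WD-repeat sequences).
seq_a = "GHSDWVTAVA"
seq_b = "GHTDAVTSVA"

# Hypothetical structural labels per aligned position:
# 'S' = predicted surface, 'C' = predicted core.
labels = "CSSCSCCSCC"
surface = [i for i, s in enumerate(labels) if s == "S"]

overall = identity(seq_a, seq_b)                # all aligned positions
surface_only = identity(seq_a, seq_b, surface)  # surface positions only

print(f"overall identity: {overall:.2f}")
print(f"surface identity: {surface_only:.2f}")
```

In this toy case the two fragments look similar overall but differ at most of the positions labeled as surface, which is the kind of distinction a surface-focused measure is meant to expose.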

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Amino Acid Motifs / genetics*
  • Animals
  • Aspartic Acid
  • Genomics
  • Humans
  • Markov Chains
  • Multigene Family*
  • Protein Conformation
  • Protein Structure, Tertiary
  • Proteins / classification
  • Repetitive Sequences, Amino Acid*
  • Tryptophan

Substances

  • Proteins
  • Aspartic Acid
  • Tryptophan