The SUPERFAMILY database in structural genomics

Acta Crystallogr D Biol Crystallogr. 2002 Nov;58(Pt 11):1897-900. doi: 10.1107/s0907444902015160. Epub 2002 Oct 21.

Abstract

The SUPERFAMILY hidden Markov model library, which represents all proteins of known structure, predicts the domain architecture of protein sequences and classifies them at the SCOP superfamily level. This analysis has been carried out on all completely sequenced genomes. The ways in which the database can be useful to crystallographers are discussed, in particular with a view to high-throughput structure determination. The application of the SUPERFAMILY database to different target-selection strategies is suggested: novel folds, novel domain combinations and targeted attacks on genomes. Use of the database for more general inquiry in the context of structural studies is also explained. The database provides evolutionary relationships between target proteins and other proteins of known structure through the SCOP database, genome assignments and multiple sequence alignments.
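
Two of the target-selection strategies named in the abstract, novel folds and novel domain combinations, can be illustrated with a minimal Python sketch. It is not the SUPERFAMILY implementation or its data format. The input layout (protein length plus `(start, end, superfamily)` tuples in 1-based inclusive coordinates), the function names, the 50-residue threshold and the identifiers `sf_A`, `sf_B`, `sf_C` are all assumptions made for illustration.

```python
# Illustrative sketch only: data layout, names and threshold are hypothetical,
# not taken from the SUPERFAMILY database.

def unassigned_regions(length, domains, min_len=50):
    """Return (start, end) stretches of at least min_len residues that carry
    no superfamily assignment; such regions are candidate novel folds."""
    gaps = []
    pos = 1  # first residue not yet covered by an assignment
    for start, end, _ in sorted(domains):
        if start - pos >= min_len:
            gaps.append((pos, start - 1))
        pos = max(pos, end + 1)
    if length - pos + 1 >= min_len:
        gaps.append((pos, length))
    return gaps


def architecture(domains):
    """Superfamily identifiers in N- to C-terminal order."""
    return tuple(sf for _, _, sf in sorted(domains))


def novel_combinations(genome, known_architectures):
    """Multi-domain proteins whose domain order is absent from the set of
    architectures already seen in proteins of known structure."""
    return {
        pid: architecture(doms)
        for pid, (_, doms) in genome.items()
        if len(doms) > 1 and architecture(doms) not in known_architectures
    }


# Hypothetical genome: protein id -> (length, domain assignments)
genome = {
    "protA": (300, [(10, 120, "sf_A"), (130, 290, "sf_B")]),
    "protB": (400, [(1, 150, "sf_A")]),
    "protC": (250, [(5, 100, "sf_B"), (110, 240, "sf_C")]),
}
known = {("sf_A", "sf_B")}

fold_targets = {
    pid: gaps
    for pid, (length, doms) in genome.items()
    if (gaps := unassigned_regions(length, doms))
}
combo_targets = novel_combinations(genome, known)
```

With this toy input, `fold_targets` flags the long unassigned C-terminal region of `protB`, and `combo_targets` flags `protC`, whose domain order does not appear in `known`. A real pipeline would also need to weigh assignment confidence and other reasons a region may be unassigned, which this sketch ignores.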

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • Databases, Protein*
  • Escherichia coli / genetics
  • Evolution, Molecular
  • Genome*
  • Humans
  • Information Storage and Retrieval
  • Molecular Sequence Data
  • Protein Structure, Tertiary
  • Proteins / chemistry
  • Proteins / classification
  • Proteins / genetics*
  • Sequence Alignment

Substances

  • Proteins