Improvements to CluSTr: the database of SWISS-PROT+TrEMBL protein clusters

Nucleic Acids Res. 2003 Jan 1;31(1):388-9. doi: 10.1093/nar/gkg035.

Abstract

The CluSTr database (http://www.ebi.ac.uk/clustr/) offers an automatic classification of SWISS-PROT+TrEMBL proteins into groups of related proteins. The clustering is based on analysis of all pair-wise sequence comparisons between proteins using the Smith-Waterman algorithm. The analysis, carried out at different levels of protein similarity, yields a hierarchical organization of clusters. Information about the domain content of the clustered proteins is provided via the InterPro resource. The newly introduced InterPro 'condensed graphical view' simplifies visual analysis of the domain architectures represented. Integrated applications allow users to visualize and edit multiple alignments and to build sequence divergence trees. Links to the relevant structural data in the Protein Data Bank (PDB) and the Homology-derived Secondary Structure of Proteins (HSSP) database are also provided.
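To illustrate how clustering at several similarity levels can yield a hierarchy, here is a minimal sketch. It rests on assumptions not stated in the abstract: the pair-wise similarity scores are already computed (the numbers below are invented toy values, not Smith-Waterman output), proteins are joined by single linkage whenever a pair's score meets the threshold, and the names `cluster_at_threshold` and `hierarchy` are hypothetical. It is not CluSTr's actual implementation.

```python
from collections import defaultdict


def cluster_at_threshold(proteins, scores, threshold):
    """Group proteins by single linkage: any pair scoring >= threshold is joined.

    `scores` maps (protein_a, protein_b) tuples to a similarity score.
    The linkage rule is an assumption made for this sketch.
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), score in scores.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)
    return sorted(sorted(g) for g in groups.values())


def hierarchy(proteins, scores, thresholds):
    """Cluster at each threshold, strictest first, giving nested levels."""
    return {
        t: cluster_at_threshold(proteins, scores, t)
        for t in sorted(thresholds, reverse=True)
    }


# Toy data: invented scores for four hypothetical proteins.
proteins = ["P1", "P2", "P3", "P4"]
scores = {
    ("P1", "P2"): 90,
    ("P2", "P3"): 50,
    ("P1", "P3"): 45,
    ("P3", "P4"): 20,
    ("P2", "P4"): 10,
    ("P1", "P4"): 5,
}
levels = hierarchy(proteins, scores, [80, 40, 15])
for threshold, clusters in levels.items():
    print(threshold, clusters)
```

Because lowering the threshold can only merge groups and never split them, each cluster at a stricter level is contained in exactly one cluster at the next, looser level, which is what makes the result a hierarchy.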

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cluster Analysis
  • Computer Graphics
  • Databases, Protein*
  • Internet
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteins / classification*
  • Sequence Alignment
  • User-Computer Interface

Substances

  • Proteins