An automatic method involving cluster analysis of secondary structures for the identification of domains in proteins

R Sowdhamini; T L Blundell

doi:10.1002/pro.5560040317

Protein Sci. 1995 Mar;4(3):506-20. doi: 10.1002/pro.5560040317.

Authors

R Sowdhamini¹, T L Blundell

Affiliation

¹ Department of Crystallography, Birkbeck College, London, United Kingdom.

Abstract

With a growing number of structures available in the Brookhaven Protein Data Bank, automatic methods for domain identification are required for the construction of databases. Domains are considered to be clusters of secondary structure elements. Thus, helices and strands are first clustered using intersecondary structural distances between Cα positions, and dendrograms based on this distance measure are used to identify domains. Individual domains are recognized by a disjoint factor, which enables the automatic identification and classification into disjoint, interacting, and conjoint domains. Application to a database of 83 protein families and 18 unique structures shows that the approach provides an effective delineation of boundaries and identifies those proteins that can be considered as a single domain. A quantitative estimate of the interaction between domains has been proposed. The database of protein domains is a useful tool for understanding protein folding, for recognizing protein folds, and for understanding structure-activity relationships.
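
The clustering step described above can be illustrated with a minimal sketch. The abstract does not give the exact distance definition, the linkage rule, or the formula for the disjoint factor, so everything below is an assumption made for illustration: the distance between two secondary structure elements is taken as the mean pairwise distance between their Cα coordinates, merging uses average linkage, and a fixed target number of clusters stands in for the disjoint-factor criterion. The names `sse_distance` and `cluster_sses` and the toy coordinates are hypothetical and do not come from the paper.

```python
import math
from itertools import combinations


def sse_distance(a, b):
    """Mean pairwise distance between the C-alpha coordinates of two
    secondary structure elements (an assumed measure, not the paper's)."""
    total = sum(math.dist(p, q) for p in a for q in b)
    return total / (len(a) * len(b))


def cluster_sses(sses, n_clusters):
    """Average-linkage agglomerative clustering of named elements.

    Returns the final clusters and the merge history; the history is the
    information a dendrogram would display. Stopping at n_clusters is a
    stand-in for the disjoint-factor test, which the abstract does not define.
    """
    names = list(sses)
    # Precompute element-to-element distances once.
    d = {
        frozenset((x, y)): sse_distance(sses[x], sses[y])
        for x, y in combinations(names, 2)
    }

    def linkage(c1, c2):
        # Average of all element-to-element distances across two clusters.
        return sum(d[frozenset((x, y))] for x in c1 for y in c2) / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return clusters, merges


# Toy input: two helix/strand pairs placed about 30 angstroms apart.
sses = {
    "H1": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)],
    "E1": [(0.0, 5.0, 0.0), (1.5, 5.0, 0.0), (3.0, 5.0, 0.0)],
    "H2": [(30.0, 0.0, 0.0), (31.5, 0.0, 0.0), (33.0, 0.0, 0.0)],
    "E2": [(30.0, 5.0, 0.0), (31.5, 5.0, 0.0), (33.0, 5.0, 0.0)],
}
domains, history = cluster_sses(sses, n_clusters=2)
print(sorted(domains))
```

In this toy case the two spatially separated pairs end up in separate clusters, which is the behaviour one would expect of well-separated domains. A real application would read Cα coordinates and secondary structure assignments from a structure file and would need a principled stopping rule in place of the fixed cluster count.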

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Aspartic Acid Endopeptidases / chemistry
Calmodulin / chemistry
Cluster Analysis*
Databases, Factual
Hydroxymethylbilane Synthase / chemistry
Models, Chemical
Models, Molecular
Papain / chemistry
Porins / chemistry
Protein Structure, Secondary*
Protein Structure, Tertiary*
Sequence Alignment

Substances

Calmodulin
Porins
Hydroxymethylbilane Synthase
Papain
Aspartic Acid Endopeptidases
Endothia aspartic proteinase