Side-chain clusters in protein structures and their role in protein folding

J Mol Biol. 1991 Jul 5;220(1):151-71. doi: 10.1016/0022-2836(91)90388-m.

Abstract

A method has been developed to detect dense clusters of residue side-chains in proteins, where contact is based upon the percentage of the maximum possible for a given residue type. The clusters represent protein sites with the highest degree of interaction amongst their member residues, while contacts with the environment surrounding the cluster are lower in number. The method has been applied to three distinct structural sets of proteins to check for consistency: mixed alpha-helical/beta-sheet proteins, all beta-strand proteins, and all alpha-helical proteins. A number of cluster features generated from these sets are of general interest for protein folding. (1) A majority of the clusters, comprising three to four residues on average, are localized near the protein surfaces and not within the protein cores. (2) The clusters have preferences for the N- and C-terminal ends of alpha-helices and beta-strands in alpha/beta and alpha-proteins, while beta-proteins utilize the middle strand regions more often. A number of clusters connect three or more beta-strands and/or alpha-helices. (3) More than half of the clusters display residue pairs with oppositely charged atoms within 4.5 Å of each other. (4) The residue composition of the clusters does not show correlation with hydrophobicity measures but rather with side-chain volume and surface. The highly preferred cluster residues are (in order of decreasing preference) Trp, His, Arg, Tyr, Glu, Gln and Phe. Clusters with extensive internal contacts in related haemoglobin and immunoglobulin tertiary structures show respective conservation. Several examples illustrate "strategic" folding positions in proteins that often bring together a number of sheets and/or helices, suggesting a folding model in which largely preformed secondary structures are joined together in a cluster-induced collapse. Alternatively, the clusters may form at some stage in the folding process to reduce considerably the searchable conformational space and help maintain the proper folding pathway. The clusters also provide hints for site-directed mutagenesis and protein engineering experiments, as they are also suggested to be important for structural stability.
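
The contact criterion and the charged-pair check described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the `MAX_CONTACTS` values, the 0.5 threshold, the choice to normalize by the smaller of the two residue maxima, and all function names are hypothetical assumptions made only to show how contacts expressed as a fraction of a per-residue-type maximum might be grouped into clusters, and how oppositely charged atoms within 4.5 Å might be detected.

```python
import math
from collections import defaultdict

# Hypothetical per-residue-type contact maxima (illustrative, NOT from the paper).
MAX_CONTACTS = {"TRP": 40, "ARG": 30, "GLU": 20, "ALA": 8}


def normalized_contact(count, type_a, type_b):
    """Contact count as a fraction of the smaller residue maximum (an assumption)."""
    return count / min(MAX_CONTACTS[type_a], MAX_CONTACTS[type_b])


def find_clusters(residues, pair_counts, threshold=0.5, min_size=3):
    """Group residues linked by normalized contacts >= threshold.

    residues:    {residue_id: residue_type}
    pair_counts: {(id_a, id_b): raw atom-atom contact count}
    Returns connected components with at least min_size members.
    """
    graph = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if normalized_contact(count, residues[a], residues[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)

    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over strongly contacting residues
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        if len(component) >= min_size:
            clusters.append(sorted(component))
    return clusters


def has_opposite_charge_pair(atoms_a, atoms_b, cutoff=4.5):
    """True if any two atoms of opposite charge sign lie within cutoff (in Å).

    Each atom is (charge_sign, (x, y, z)) with charge_sign in {-1, +1}.
    """
    return any(
        sign_a * sign_b < 0 and math.dist(xyz_a, xyz_b) <= cutoff
        for sign_a, xyz_a in atoms_a
        for sign_b, xyz_b in atoms_b
    )
```

With residues `{1: "TRP", 2: "ARG", 3: "GLU", 4: "ALA"}` and raw counts `{(1, 2): 20, (2, 3): 12, (3, 4): 1}`, the first two pairs exceed the 0.5 threshold and the third does not, so residues 1, 2 and 3 form a single cluster under these assumed values.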

Publication types

  • Comparative Study

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Databases, Factual
  • Enzymes / chemistry*
  • Globins / chemistry
  • Globins / genetics
  • Immunoglobulins / genetics
  • Models, Molecular
  • Molecular Sequence Data
  • Protein Conformation*
  • Proteins / chemistry*
  • Sequence Homology, Nucleic Acid

Substances

  • Enzymes
  • Immunoglobulins
  • Proteins
  • Globins