Nonrandom distribution of intramolecular contacts in native single-domain proteins

Proteins. 2009 May 1;75(2):404-12. doi: 10.1002/prot.22258.

Abstract

The interplay of short- and long-range interactions in protein structure and folding is poorly understood. This study focuses on the distribution of intramolecular contacts across different regions of the polypeptide chain in soluble single-domain proteins. We show that while the average number of intramolecular interactions per residue is similar across all regions of the sequence, the interaction counterparts are distributed nonrandomly. Two classes of proteins are observed. The first class comprises structures in which the majority of intramolecular contacts link amino acids within the same region of the sequence (i.e., the N-terminal, C-terminal, or intermediate portion of the chain). A second, smaller class includes proteins that have extensive contacts between the N and C termini. Such extensive interactions involve primarily distal beta-strands and are detected via the NCR parameter, a descriptor of the number of contacts with interaction counterparts in specific regions of the sequence. In summary, the majority of single-domain proteins (first class) are dominated by short-range interactions between contiguous elements of secondary structure and have only sparse contacts between the N and C termini. This finding defies the common assumption that the chain termini, often spatially close in folded proteins, have to participate in a large number of mutual interactions. Finally, our results suggest that the C-terminal region of Class 2 proteins may be particularly effective at promoting folding upon completion of protein biosynthesis in the cell.
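As an illustration of the kind of bookkeeping described above, the following minimal Python sketch counts residue-residue contacts by the sequence regions of the two partners. It is not the NCR parameter, whose exact definition the abstract does not give. The function names, the split of the chain into equal thirds (N, M, C), the use of one coordinate per residue, the 8 Å distance cutoff, and the minimum sequence separation are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def region_of(i, n):
    """Assign 0-based residue index i of an n-residue chain to a region.

    Assumption: the chain is split into equal thirds labelled
    'N' (N-terminal), 'M' (intermediate), and 'C' (C-terminal).
    """
    if i < n / 3:
        return "N"
    if i < 2 * n / 3:
        return "M"
    return "C"


def region_contact_counts(coords, cutoff=8.0, min_separation=2):
    """Count contacts, keyed by the regions of the two residues.

    coords: one (x, y, z) tuple per residue, in sequence order
            (e.g. C-alpha positions; an assumption, not the paper's choice).
    cutoff: distance in angstroms at or below which two residues are
            considered in contact (hypothetical default).
    min_separation: minimum sequence separation; pairs closer than this
            along the chain are skipped.
    Returns a dict such as {'NN': 3, 'NC': 2}.
    """
    n = len(coords)
    counts = {}
    for i, j in combinations(range(n), 2):
        if j - i < min_separation:
            continue
        if dist(coords[i], coords[j]) <= cutoff:
            key = region_of(i, n) + region_of(j, n)
            counts[key] = counts.get(key, 0) + 1
    return counts


# Toy six-residue hairpin: two antiparallel three-residue segments 5 A apart.
hairpin = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0),
]
hairpin_counts = region_contact_counts(hairpin, cutoff=5.5)
```

For the toy hairpin, the only contacts passing the separation filter pair the first two residues with the last two, so every counted contact falls in the N-C category, loosely mimicking the second class of proteins described above. A chain dominated by same-region keys (NN, MM, CC) would instead resemble the first class.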

MeSH terms

  • Amino Acid Sequence
  • Computer Simulation
  • Databases, Protein
  • Models, Molecular
  • Protein Conformation
  • Protein Folding
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Proteins / metabolism

Substances

  • Proteins