Geometrical and sequence characteristics of alpha-helices in globular proteins

Biophys J. 1998 Oct;75(4):1935-44. doi: 10.1016/S0006-3495(98)77634-9.

Abstract

Understanding the sequence-structure relationships in globular proteins is important for reliable protein structure prediction and de novo design. Using a database of 1131 alpha-helices with nonidentical sequences from 205 nonhomologous globular protein chains, we have analyzed structural and sequence characteristics of alpha-helices. We find that geometries of more than 99% of all the alpha-helices can be simply characterised as being linear, curved, or kinked. Only a small number of alpha-helices (approximately 4%) show sharp localized bends in their middle regions, and thus are classified as kinked. Approximately three-fourths (approximately 73%) of the alpha-helices in globular proteins show varying degrees of smooth curvature, with a mean radius of curvature of 65 ± 33 Å; longer helices are less curved. Computation of helix accessibility to the solvent indicates that nearly two-thirds of the helices (approximately 66%) are largely buried in the protein core, and the length and geometry of the helices are not correlated with their location in the protein globule. However, the amino acid compositions and propensities of individual amino acids to occur in alpha-helices vary with their location in the protein globule, their geometries, and their lengths. In particular, Gln, Glu, Lys, and Arg are found more often in helices near the surface of globular proteins. Interestingly, kinks often seem to occur in regions where amino acids with low helix propensities (e.g., beta-branched and aromatic residues) cluster together, in addition to those associated with the occurrence of proline residues. Hence the propensities of individual amino acids to occur in a given secondary structure depend not only on conformation but also on its length, geometry, and location in the protein globule.
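The abstract does not state how the propensities were computed. As an illustration only, the sketch below uses the common ratio-of-frequencies definition: the fraction of a residue type among helical positions divided by its fraction among all positions. The function name `helix_propensities`, the boolean `helix_mask` input, and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter


def helix_propensities(sequence, helix_mask):
    """Return {residue: propensity} for one-letter residue codes.

    Assumed definition (not confirmed by the abstract):
    propensity = (share of the residue among helical positions)
                 / (share of the residue among all positions).
    Values above 1 mean the residue is over-represented in helices.
    """
    if len(sequence) != len(helix_mask):
        raise ValueError("sequence and helix_mask must have the same length")

    all_counts = Counter(sequence)
    helix_counts = Counter(
        aa for aa, in_helix in zip(sequence, helix_mask) if in_helix
    )

    n_all = len(sequence)
    n_helix = sum(helix_counts.values())
    if n_helix == 0:
        raise ValueError("helix_mask marks no helical positions")

    # Counter returns 0 for residues never seen in a helix.
    return {
        aa: (helix_counts[aa] / n_helix) / (all_counts[aa] / n_all)
        for aa in all_counts
    }


# Toy data (hypothetical): the first four positions are helical.
seq = "AAEEGGPP"
mask = [True] * 4 + [False] * 4
props = helix_propensities(seq, mask)
```

To study the dependence on location, geometry, or length described above, the same function could be run separately on subsets of helices (for example buried versus surface helices) and the resulting values compared.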

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence*
  • Databases, Factual
  • Protein Structure, Secondary*
  • Proteins / chemistry*
  • Solvents

Substances

  • Proteins
  • Solvents