Amino acid pairing at the N- and C-termini of helical segments in proteins

Proteins. 2008 Jan 1;70(1):188-96. doi: 10.1002/prot.21525.

Abstract

A systematic survey was carried out on an unbiased sample of 815 protein chains with a maximum of 20% homology, selected from the Protein Data Bank, whose structures were solved at a resolution better than 1.6 Å and with an R-factor lower than 25%. A set of 5556 subsequences with alpha-helix or 3(10)-helix motifs was extracted from these protein chains. Global and local propensities were then calculated for all possible amino acid pairs of the type (i, i + 1), (i, i + 2), (i, i + 3), and (i, i + 4), starting at the relevant helical positions N1, N2, N3, C3, C2, C1, and N-int (interior positions), and also at the first nonhelical positions at both termini of the helices, namely, N-cap and C-cap. Statistical analysis of the propensity values showed that pairing depends significantly on the type of the amino acids and on the position of the pair. A few sequences of three and four amino acids were selected, and their high prevalence in helices is outlined in this work. The Glu-Lys-Tyr-Pro sequence shows a peculiar distribution in proteins, which may suggest a relevant structural role in alpha-helices when Pro is located at the C-cap position. A bioinformatics tool was developed that automatically and periodically updates the results and makes them available on a website.
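
The abstract does not state how the propensities were defined. As a rough illustration only, the Python sketch below assumes a common observed-over-expected definition: the frequency of a pair (i, i + k) among all such pairs in the helical subsequences, divided by the product of the two residues' single-residue frequencies. The function name `pair_propensities` and the input format (strings of one-letter codes) are hypothetical and do not come from the paper.

```python
from collections import Counter


def pair_propensities(helices, offset):
    """Observed/expected ratio for residue pairs (i, i + offset).

    ASSUMPTION: this observed-over-expected definition is illustrative;
    the paper's exact formula is not given in the abstract.
    helices: iterable of helical subsequences as one-letter-code strings.
    offset:  pair spacing k, e.g. 1, 2, 3, or 4.
    """
    pair_counts = Counter()
    residue_counts = Counter()
    for seq in helices:
        residue_counts.update(seq)
        # Count every pair (seq[i], seq[i + offset]) inside the helix.
        for i in range(len(seq) - offset):
            pair_counts[(seq[i], seq[i + offset])] += 1

    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())

    propensities = {}
    for (a, b), n in pair_counts.items():
        observed = n / total_pairs
        # Expected frequency if the two residues occurred independently.
        expected = (residue_counts[a] / total_residues) * (
            residue_counts[b] / total_residues
        )
        propensities[(a, b)] = observed / expected
    return propensities


# Toy input, not real data: one made-up helical subsequence.
toy = pair_propensities(["AEAE"], offset=2)
print(toy[("A", "A")])  # 2.0
```

A value above 1 would indicate a pair seen more often than expected from residue composition alone. Position-specific (local) propensities, such as those for pairs starting at N-cap or N1, would require restricting the counts to pairs that begin at that position.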

MeSH terms

  • Amino Acids / chemistry*
  • Hydrogen Bonding
  • Protein Folding
  • Protein Structure, Secondary
  • Proteins / chemistry*

Substances

  • Amino Acids
  • Proteins