Biol Res. 2016 Jul 4;49(1):31.
doi: 10.1186/s40659-016-0092-5.

50 years of amino acid hydrophobicity scales: revisiting the capacity for peptide classification

Stefan Simm et al. Biol Res. 2016.

Abstract

Background: Physicochemical properties are frequently analyzed to characterize protein sequences of known and unknown function. The hydrophobicity of amino acids in particular is often used for structural prediction or for the detection of membrane-associated or membrane-embedded β-sheets and α-helices. For this purpose, many scales classifying amino acids according to their physicochemical properties have been defined over the past decades. In parallel, several hydrophobicity parameters have been defined for the calculation of peptide properties. We analyzed the performance of separating sequence pools using 98 hydrophobicity scales and five different hydrophobicity parameters, namely the overall hydrophobicity, the hydrophobic moment for detection of α-helical and β-sheet membrane segments, the alternating hydrophobicity and the exact β-strand score.
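The first two kinds of parameter named above can be illustrated with a minimal Python sketch. It assumes the Kyte-Doolittle scale as an example scale and an Eisenberg-style definition of the hydrophobic moment, with an angular step of 100° per residue for α-helices and 180° for idealized β-strands. The exact formulas and angles used in the study are not stated in this abstract, and the function names are illustrative only.

```python
import math

# Kyte-Doolittle hydropathy values, used here only as an example scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(seq, scale=KYTE_DOOLITTLE):
    """Average scale value over all residues of the peptide."""
    return sum(scale[aa] for aa in seq) / len(seq)


def hydrophobic_moment(seq, angle_deg, scale=KYTE_DOOLITTLE):
    """Length-normalized hydrophobic moment for a fixed angle per residue.

    Each residue contributes a vector of length scale[aa] rotated by
    i * angle_deg; the moment is the magnitude of the vector sum.
    """
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * delta) for i, aa in enumerate(seq))
    cos_sum = sum(scale[aa] * math.cos(i * delta) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)


# Hypothetical test peptide, chosen only for illustration.
peptide = "LKKLLKLLKKLLKL"
print("overall hydrophobicity:", mean_hydrophobicity(peptide))
print("moment, helix angle (100 deg):", hydrophobic_moment(peptide, 100))
print("moment, strand angle (180 deg):", hydrophobic_moment(peptide, 180))
```

Swapping in a different dictionary for `scale` is how such a sketch would compare scales on the same peptide.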

Results: Most of the scales are capable of discriminating between transmembrane α-helices and transmembrane β-sheets, but the assignment of peptides to pools of soluble peptides of different secondary structures is not achieved at the same quality. The separation capacity, a measure of the discrimination between different structural elements, is highest when all five hydrophobicity parameters are used, although adding the alternating hydrophobicity does not provide a large benefit. An in silico evolutionary approach shows that the separation capacity of scales is limited, with a general maximum of about 0.6. We observed that scales derived from the evolutionary approach performed best in separating the different peptide pools when the values for arginine and tyrosine were largely distinct from the value of glutamate. Finally, the separation of secondary structure pools via hydrophobicity can be supported by specific detectable patterns of four amino acids.

Conclusion: It can be assumed that the separation capacity of a given scale depends on the spacing of the hydrophobicity values of certain amino acids. Despite the wealth of hydrophobicity scales, no single scale separates all kinds of secondary structures or soluble from transmembrane peptides, reflecting that properties other than hydrophobicity also affect secondary structure formation. Nevertheless, hydrophobicity scales allow peptides with transmembrane α-helices to be distinguished from those with transmembrane β-sheets. Furthermore, the overall separation capacity score of 0.6 obtained with the different hydrophobicity parameters could be assisted by a pattern search at the protein sequence level for specific peptides with a length of four amino acids.
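A pattern search of this kind can be sketched as counting contiguous four-residue patterns per pool and comparing their relative frequencies. This is a minimal illustration: the toy pools, the function name and the contiguous-window definition of a pattern are assumptions, not the procedure of the study.

```python
from collections import Counter


def pattern_frequencies(sequences, k=4):
    """Relative frequency of every contiguous k-residue pattern in a pool."""
    counts = Counter(
        seq[i:i + k]
        for seq in sequences
        for i in range(len(seq) - k + 1)  # empty for sequences shorter than k
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pattern: n / total for pattern, n in counts.items()}


# Hypothetical toy pools, for illustration only.
pool_a = ["LLVALLVA", "ALLVAL"]
pool_b = ["KDEKDEKD", "DEKDE"]

freq_a = pattern_frequencies(pool_a)
freq_b = pattern_frequencies(pool_b)

# Patterns seen in one pool but not the other are candidate discriminators.
only_in_a = sorted(set(freq_a) - set(freq_b))
print(only_in_a)
```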

Keywords: Alternate hydrophobicity; Amino acid pattern; Beta-sheet; Hydrophobicity scale; Transmembrane helix; Transmembrane sheets.


Figures

Fig. 1
Clustering of hydrophobicity scales. Shown is the UPGMA tree of the clustered hydrophobicity scales based on the normalized amino acid value distances (see “Methods” section). The hydrophobicity scales are clustered into groups (a to z) at a similarity larger than 0.07
Fig. 2
Scoring scheme. Shown is the scoring scheme for the separation of two sequence pools. Each sequence pool (dark grey area, grey area) forms a cloud. The overlap of both clouds (light grey area) and the sequences (bold points) within this overlap are used to calculate the separation capacity. The circle surrounding both clouds represents the volume of the outliers
Fig. 3
Separation of pools by hydrophobicity scales. a Shown is the overall separation value for each hydrophobicity scale for the secondary structure (orange), in silico tryptic digest (blue) and mixed (green) sequence pools as area plot. The hydrophobicity scales are sorted from highest to lowest value. b The same as in a but the separation value is calculated for the cluster of hydrophobicity scales
Fig. 4
Separation capacity of specific sequence pools. Shown is the pairwise separation capacity for scale 14 (a) and for the best value obtained with any of the hydrophobicity scales (b) as radar plots, focusing on separation capacity below 0.4 (left) and above 0.4 (right). Each line represents one pool, and its separation from all other pools is marked by the corresponding symbol
Fig. 5
Influence of hydrophobicity parameters on separation. a Shown is the percentage of scenarios reaching a specific separation value for all sequence pools including outliers (dashed line) and without outliers (solid line). The dash-dotted line shows the best separated 5 % of all scenarios and serves as the marginal value for setting the threshold used to analyze the influence of the different hydrophobicity parameters on the separation. b Shown is the influence on separation of the ten hydrophobicity parameters (Table 5) for the secondary structure based sequence pools (black), the sequence pools generated by digestion (white) and the combination of both (grey). The hydrophobicity parameters are paired (max., min.). The separation influence is calculated as the absolute value of the difference between the observed and expected frequency in the best 5 % of separated scenarios (Fig. 5a)
Fig. 6
Amino acid pattern distribution. Shown is the percentage of occurrence of all possible amino acid patterns of a specific length in the different sequence pools. The length of the patterns varies from 2 to 5. 2 AA: black circle; 3 AA: red circle; 4 AA: green triangle down; 5 AA: yellow triangle up
Fig. 7
Separation capacity using evolution of random in silico scales. Shown is the box plot of separation capacity distribution of the 98 real hydrophobicity scales (real), the 200 randomly created scales (random, see “Methods” section) and the six in silico evolution steps (evoS1 to evoS6). The evolutionary optimization of the evolutionary approach was analyzed for the best performing scale identified after each step (dashed line) and the predicted plateau of 0.588 is shown as dotted line
Fig. 8
Distance of amino acid values in hydrophobicity scales. a Calculated was the absolute difference between the values of two amino acids for the best performing evolutionary derived scale (first box), for scale 14 (highest S value, second box) and for scale 40 (lowest S value, third box) after normalization of the scales to (X - min)/(max - min). Green boxes mark distances below 0.1, dark green boxes distances below 0.01, red boxes distances above 0.9 and dark red boxes distances above 0.99. Combinations framed in orange mark amino acids for which values should be rather similar, as concluded from the low difference in the best performing and the evolutionary evolved scale; blue frames mark amino acid combinations for which values should be rather different, as concluded from the low difference in the best performing and the evolutionary evolved scale; and yellow frames mark amino acid combinations for which the value difference is irrelevant. b The clusters with comparable (black lines) or distinct amino acid values (blue lines) are shown. c The arrow indicates the distance of amino acid values that should be present in a well-performing scale
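The normalization and pairwise distances described for Fig. 8a can be sketched as follows. The example uses a hypothetical four-residue subset of Kyte-Doolittle values, the function names are illustrative, and the 0.1 and 0.9 cutoffs simply mirror the caption.

```python
from itertools import combinations


def min_max_normalize(scale):
    """Rescale scale values to [0, 1] via (X - min) / (max - min)."""
    lo, hi = min(scale.values()), max(scale.values())
    if hi == lo:
        raise ValueError("scale has no spread; cannot normalize")
    return {aa: (value - lo) / (hi - lo) for aa, value in scale.items()}


def pairwise_distances(scale):
    """Absolute difference of normalized values for every amino acid pair."""
    norm = min_max_normalize(scale)
    return {
        (a, b): abs(norm[a] - norm[b])
        for a, b in combinations(sorted(norm), 2)
    }


# Hypothetical subset of a scale (Kyte-Doolittle values), for illustration.
example_scale = {"R": -4.5, "E": -3.5, "Y": -1.3, "I": 4.5}

dist = pairwise_distances(example_scale)
close_pairs = [pair for pair, d in dist.items() if d < 0.1]  # "green" boxes
far_pairs = [pair for pair, d in dist.items() if d > 0.9]    # "red" boxes
print(close_pairs, far_pairs)
```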
