Comparative Study
Nucleic Acids Res. 2004 Feb 11;32(3):1037-49.
doi: 10.1093/nar/gkh253. Print 2004.

The Importance of Intrinsic Disorder for Protein Phosphorylation

Free PMC article

Lilia M Iakoucheva et al. Nucleic Acids Res.

Abstract

Reversible protein phosphorylation provides a major regulatory mechanism in eukaryotic cells. Because the amino acid residues flanking the relatively limited number of experimentally identified phosphorylation sites are highly variable, reliable prediction of such sites remains an important issue. Here we report the development of a new web-based tool for the prediction of protein phosphorylation sites, DISPHOS (DISorder-enhanced PHOSphorylation predictor, http://www.ist.temple.edu/DISPHOS). We observed that amino acid compositions, sequence complexity, hydrophobicity, charge and other sequence attributes of regions adjacent to phosphorylation sites are very similar to those of intrinsically disordered protein regions. Thus, DISPHOS uses position-specific amino acid frequencies and disorder information to improve the discrimination between phosphorylation and non-phosphorylation sites. Based on the estimates of phosphorylation rates in various protein categories, the outputs of DISPHOS are adjusted in order to reduce the total number of misclassified residues. When tested on an equal number of phosphorylated and non-phosphorylated residues, the accuracy of DISPHOS reaches 76% for serine, 81% for threonine and 83% for tyrosine. The significant enrichment in disorder-promoting residues surrounding phosphorylation sites, together with the results obtained by applying DISPHOS to various protein functional classes and proteomes, provides strong support for the hypothesis that protein phosphorylation predominantly occurs within intrinsically disordered protein regions.
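The abstract says that predictor outputs are adjusted using estimated phosphorylation rates, but it does not give the formula. As an illustration only, the sketch below applies the standard prior-shift correction to a probability produced by a classifier trained on balanced data. The function name `adjust_for_prior`, its parameters, and the example rate of 0.1 are assumptions made for this sketch, not details taken from DISPHOS.

```python
def adjust_for_prior(p_balanced: float, true_prior: float, train_prior: float = 0.5) -> float:
    """Rescale a classifier probability from the training prior to an assumed true prior.

    p_balanced  : probability of "phosphorylated" from a model trained with
                  class prior `train_prior` (0.5 for a balanced training set).
    true_prior  : assumed fraction of phosphorylated residues in the target
                  protein category (hypothetical value supplied by the caller).
    """
    if not (0.0 < true_prior < 1.0 and 0.0 < train_prior < 1.0):
        raise ValueError("priors must lie strictly between 0 and 1")
    if not (0.0 <= p_balanced <= 1.0):
        raise ValueError("p_balanced must be a probability")
    # Reweight each class by the ratio of its true prior to its training prior.
    pos = p_balanced * true_prior / train_prior
    neg = (1.0 - p_balanced) * (1.0 - true_prior) / (1.0 - train_prior)
    return pos / (pos + neg)


# A score of 0.8 from a balanced model drops well below 0.5 once a
# hypothetical 10% phosphorylation rate is taken into account.
adjusted = adjust_for_prior(0.8, true_prior=0.1)
```

With a low assumed rate, only high-scoring residues stay above a 0.5 decision threshold. That is one way an adjustment of this kind could reduce the total number of misclassified residues.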

Figures

Figure 1
The amino acid residues significantly enriched and depleted around phosphorylation sites. Each residue is assigned a property: surface (red) or buried (black) according to Janin’s scale (28) (A), charged (black) or neutral (green) (B), hydrophobic (black) or hydrophilic (blue) according to Eisenberg’s scale (39) (C), high (pink) or low (blue) flexibility index according to the flexibility scale (27) (D). Following the definition of Vihinen et al. (27), alanine and threonine were considered to have high flexibility if flanked by residues with HFP, and low flexibility if surrounded by residues with LFP.
Figure 2
Sequence complexity distributions. Sequence complexity K2 was calculated for a sliding window of 45 residues. The data set ‘all disorder’ consisted of disordered regions characterized by X-ray diffraction (extracted from PDB-Select-25), NMR and CD (extracted from the literature). The Globular-3D data set consisted of ordered protein regions extracted from PDB; fibrous sequences such as coiled coils, collagen and silk fibroins were removed from this data set. A lower sequence complexity is observed for P-sites as compared with NP-sites (inset).
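The caption does not define K2. A minimal sketch of a sliding-window complexity calculation follows, assuming K2 is the Shannon entropy (in bits) of the amino acid composition within each window. The function names `k2_entropy` and `sliding_k2` are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def k2_entropy(window: str) -> float:
    """Shannon entropy (bits) of the residue composition of one window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def sliding_k2(sequence: str, window: int = 45) -> list[float]:
    """Entropy of every full-length window; empty if the sequence is too short."""
    if len(sequence) < window:
        return []
    return [
        k2_entropy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


# A homopolymer has zero complexity; mixed-composition windows score higher.
profile = sliding_k2("A" * 50, window=45)
```

Under this assumption, lower values indicate a more repetitive or compositionally biased window, which is the direction reported for P-sites relative to NP-sites.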
Figure 3
Comparison of amino acid compositions between disordered protein regions, P- and NP-sites. The composition for each data set is shown in comparison with the ordered Globular-3D data set. The results are presented as the difference between the composition of each data set and the composition of ordered globular protein regions: $(C_{\text{data set}} - C_{\text{Globular-3D}}) / C_{\text{Globular-3D}}$. A negative bar indicates that the data set is depleted in the corresponding amino acid, and a positive bar indicates enrichment. Amino acid residues on the x-axes are arranged according to the flexibility scale (27). The middle residues for P- and NP-sites representing actual phosphorylation sites were excluded from the calculations. The error bars correspond to 1 SD.
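The relative composition difference in this caption can be sketched as below. The function names and the toy sequences are illustrative assumptions. Unlike the figure, the sketch does not exclude central residues or estimate error bars.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences: list[str]) -> dict[str, float]:
    """Fraction of each standard amino acid across all sequences in a data set."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("data set contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def relative_difference(dataset: list[str], reference: list[str]) -> dict[str, float]:
    """(C_dataset - C_reference) / C_reference for each residue present in the reference."""
    c_data = composition(dataset)
    c_ref = composition(reference)
    return {
        aa: (c_data[aa] - c_ref[aa]) / c_ref[aa]
        for aa in AMINO_ACIDS
        if c_ref[aa] > 0  # residues absent from the reference are skipped
    }


# Toy example: the data set is enriched in S and depleted in A relative to the reference.
diff = relative_difference(["SSSA"], ["SSAA"])
```

Positive values correspond to enrichment and negative values to depletion relative to the reference set, which matches the sign convention described for the bars.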
Figure 4
The process of model building and testing.
Figure 5
Estimated percentages of S, T and Y phosphorylation sites in 12 functional protein categories from SWISS-PROT and in disordered and ordered data sets. The y-axis indicates the estimated percentage of phosphorylated S, T or Y residues in each data set. The data set ‘all disorder’ consisted of disordered protein regions characterized by X-ray diffraction (extracted from PDB-Select-25), NMR and CD (extracted from the literature). The data set ‘PDB order’ consisted of the ordered protein regions with known 3D structure extracted from PDB. The error bars correspond to 1 SD.
