BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S10.
doi: 10.1186/1471-2105-11-S1-S10.

PostMod: Sequence Based Prediction of Kinase-Specific Phosphorylation Sites With Indirect Relationship

Free PMC article

Inkyung Jung et al. BMC Bioinformatics.

Abstract

Background: Post-translational modifications (PTMs) play a key role in regulating cell functions. Consequently, identifying PTM sites has a significant impact on understanding protein function and revealing cellular signal transduction. Phosphorylation in particular is a ubiquitous process, with a large portion of proteins undergoing this modification. Experimental methods to identify phosphorylation sites are labor-intensive and costly. With protein sequence data growing exponentially, the development of computational approaches to predict phosphorylation sites is highly desirable.

Results: Here, we present a simple and effective method to recognize phosphorylation sites by combining sequence patterns and evolutionary information and by applying a novel noise-reducing algorithm. Our results suggest that considering the long-range region surrounding a phosphorylation site is important for recognizing phosphorylation peptides. In a comparison with AutoMotif across 36 different kinase families, the new method outperforms AutoMotif. The mean accuracy, precision, and recall of our method are 0.93, 0.67, and 0.40, respectively, whereas those of AutoMotif with a polynomial kernel are 0.91, 0.47, and 0.17, respectively. Our method also shows better or comparable performance in four main kinase groups (CDK, CK2, PKA, and PKC) compared to six existing predictors.

Conclusion: Our method is notable in that it is a powerful and intuitive approach that does not require a sophisticated training algorithm. Moreover, our method is generally applicable to other types of PTMs.
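The accuracy, precision, and recall figures quoted in the Results can sit far apart when non-phosphorylation peptides greatly outnumber phosphorylation peptides. The following is a minimal sketch of the standard definitions of these three metrics; the function name and the confusion-matrix counts are hypothetical, chosen only for illustration, and are not data from the paper.

```python
from typing import Tuple


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp/fp: peptides predicted as phosphorylated, correctly/incorrectly.
    tn/fn: peptides predicted as non-phosphorylated, correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when nothing is predicted or nothing is positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts (NOT from the paper): 100 true sites among 1000 candidates.
accuracy, precision, recall = classification_metrics(tp=40, fp=20, tn=880, fn=60)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

With an imbalanced set like this one, accuracy is dominated by the many true negatives, so precision and recall are the more informative numbers when comparing predictors.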

Figures

Figure 1
Illustration of the noise-reducing system. In step 1, we find the top 5 hits for a given query, where Pj is a phosphorylation peptide and Nj is a non-phosphorylation peptide. In step 2, Scombined scores are calculated between the top 5 hits and all peptides in a reference set (10 peptides); if peptide i is not among the top 5 hits for peptide j, the score (j, i) is set to zero. In step 3, we calculate indirect scores by summing each row of the indirect relationship matrix. During summation we assume that scores between two positive (or two negative) peptides are signal, while those between a positive and a negative peptide are noise. Finally, in step 4, we count the phosphorylation peptides among the top 4 hits ranked by indirect score. In this example P2, P3, P4, and N2 are recognized as the top 4 hits; 3 of them are phosphorylation peptides, so we predict that the query peptide is a phosphorylation peptide.
Figure 2
ROC curves with various features. The figure shows the number of true matches (phosphorylation peptides) as a function of the number of false matches (non-phosphorylation peptides), up to 1000 false matches, among 48 kinase families. Values in brackets give the number of windowed residues in the peptides. Scombined with Sprofile (41 windowed residues) shows the best performance, which indicates that considering the long-range region is effective for identifying phosphorylation peptides.
Figure 3
The PostMod server input page (A) and result page (B). On the input page, the search sequence is pasted into the text box and one of 48 kinase types is selected. The default kinase type is AMPK_group. The example sequence is the AMPK beta-1 chain (UniProt ID P80386). The search results for phosphorylation sites are shown in (B). There are 36 candidate phosphorylation sites (S, T), three of which are recognized as phosphorylation sites (bolded lines).
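The four steps of the noise-reducing system in Figure 1 can be sketched roughly as follows. This is an illustrative reading of the caption, not the authors' implementation. Several parts are assumptions: `identity_score` is a toy stand-in for the Scombined score (which is not defined here), "noise" is modeled by subtracting scores between peptides of opposite labels, the final call uses a simple majority among the top 4, and all peptide sequences in `REFERENCE` are invented.

```python
from typing import Callable, Dict, List


def identity_score(a: str, b: str) -> float:
    """Toy stand-in for Scombined (assumption): fraction of identical positions."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


def top_hits(query: str, candidates: List[str],
             score: Callable[[str, str], float], k: int) -> List[str]:
    """Return the k candidates with the highest score against the query."""
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)[:k]


def predict(query: str, reference: Dict[str, bool],
            score: Callable[[str, str], float] = identity_score,
            k_direct: int = 5, k_indirect: int = 4) -> bool:
    """Predict whether `query` is a phosphorylation peptide.

    `reference` maps each reference peptide to True (phosphorylation)
    or False (non-phosphorylation).
    """
    peptides = list(reference)
    # Step 1: top hits for the query in the reference set.
    hits = top_hits(query, peptides, score, k_direct)
    # Step 2: for each hit j, keep scores only to j's own top hits;
    # every other score (j, i) is implicitly zero.
    indirect = {p: 0.0 for p in peptides}
    for j in hits:
        others = [p for p in peptides if p != j]
        for i in top_hits(j, others, score, k_direct):
            s = score(j, i)
            # Step 3 (assumption): same-label scores count as signal and are
            # added; opposite-label scores count as noise and are subtracted.
            indirect[i] += s if reference[i] == reference[j] else -s
    # Step 4: majority vote among the top peptides ranked by indirect score.
    best = sorted(peptides, key=lambda p: indirect[p], reverse=True)[:k_indirect]
    positives = sum(reference[p] for p in best)
    return positives > k_indirect / 2


# Invented reference set of 10 peptides (5 positive, 5 negative), for illustration only.
REFERENCE = {
    "RRASVAA": True, "RRASVAG": True, "RRASLAA": True, "KRASVAA": True, "RRSSVAA": True,
    "GGGSPPP": False, "GGGSPPL": False, "GGLSPPP": False, "AGGSPPP": False, "GGGSPLP": False,
}

print(predict("RRASVAL", REFERENCE))
print(predict("GGGSPPA", REFERENCE))
```

Ranking by indirect scores rather than by direct scores to the query means that an isolated non-phosphorylation peptide that happens to resemble the query is outvoted by reference peptides that are consistently supported by the query's other neighbors.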


