An iterative statistical approach to the identification of protein phosphorylation motifs from large-scale data sets
- PMID: 16273072
- DOI: 10.1038/nbt1146
Abstract
With the recent exponential increase in protein phosphorylation sites identified by mass spectrometry, a unique opportunity has arisen to understand the motifs surrounding such sites. Here we present an algorithm designed to extract motifs from large data sets of naturally occurring phosphorylation sites. The methodology relies on the intrinsic alignment of phospho-residues and the extraction of motifs through iterative comparison to a dynamic statistical background. Results show the identification of dozens of novel and known phosphorylation motifs from recently published serine, threonine and tyrosine phosphorylation studies. When applied to a linguistic data set to test the versatility of the approach, the algorithm successfully extracted hundreds of language motifs. This method, in addition to shedding light on the consensus sequences of identified and as yet unidentified kinases and modular protein domains, may also eventually be used as a tool to determine potential phosphorylation sites in proteins of interest.
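The abstract names two ingredients: sequences that are already aligned on the phospho-residue, and motifs built by repeated comparison against a background that changes as the search proceeds. The Python sketch below is one possible reading of that idea, not the authors' implementation. It assumes fixed-width sequence windows centered on the modified residue, a binomial over-representation test, and a greedy search that shrinks both foreground and background each time a residue/position pair is fixed. The function names, the `alpha` and `min_count` thresholds, and the pseudocount are all hypothetical choices made for illustration.

```python
from math import comb


def binom_tail(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def best_pair(fg, bg, fixed, min_count):
    """Return (p_value, position, residue) for the most over-represented
    pair at a position that is not fixed yet, or None if none qualifies."""
    best = None
    for pos in range(len(fg[0])):
        if pos in fixed:
            continue
        for res in {s[pos] for s in fg}:
            k = sum(s[pos] == res for s in fg)
            if k < min_count:
                continue
            # Background frequency with a pseudocount (assumption) so p is never 0.
            p = (sum(s[pos] == res for s in bg) + 1) / (len(bg) + 2)
            pval = binom_tail(k, len(fg), p)
            if best is None or pval < best[0]:
                best = (pval, pos, res)
    return best


def matches(seq, fixed):
    """True if seq carries every fixed residue at its position."""
    return all(seq[pos] == res for pos, res in fixed.items())


def extract_motifs(foreground, background, center, alpha=1e-3, min_count=3):
    """Greedy, iterative motif extraction (illustrative sketch only).

    Returns a list of (pattern, n_foreground_matches); '.' marks a free position.
    """
    motifs = []
    fg_pool, bg_pool = list(foreground), list(background)
    while fg_pool:
        fg, bg = fg_pool, bg_pool
        # The aligned central residue is part of every motif and is never tested.
        fixed = {center: fg[0][center]}
        while True:
            hit = best_pair(fg, bg, fixed, min_count)
            if hit is None or hit[0] >= alpha:
                break
            _, pos, res = hit
            fixed[pos] = res
            # "Dynamic background": both sets are narrowed to the fixed pair.
            fg = [s for s in fg if s[pos] == res]
            bg = [s for s in bg if s[pos] == res]
        if len(fixed) == 1:
            break  # nothing significant is left to extract
        pattern = "".join(fixed.get(i, ".") for i in range(len(fg[0])))
        motifs.append((pattern, len(fg)))
        # Remove explained sequences before searching for the next motif.
        fg_pool = [s for s in fg_pool if not matches(s, fixed)]
        bg_pool = [s for s in bg_pool if not matches(s, fixed)]
    return motifs
```

Called as `extract_motifs(foreground, background, center=3)` on 7-residue windows, the function returns patterns such as `...SP..`, each paired with the number of foreground windows that match it. Removing matched sequences after each motif is what lets weaker motifs surface in later passes; a real analysis would also need a multiple-testing correction, which this sketch omits.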
