Discovery of O-GlcNAc-modified proteins in published large-scale proteome data

Mol Cell Proteomics. 2012 Oct;11(10):843-50. doi: 10.1074/mcp.M112.019463. Epub 2012 Jun 1.


The attachment of N-acetylglucosamine to serine or threonine residues (O-GlcNAc) is a post-translational modification of nuclear and cytoplasmic proteins with emerging roles in numerous cellular processes, such as signal transduction, transcription, and translation. It is further presumed that O-GlcNAc can exhibit a site-specific, dynamic, and possibly functional interplay with phosphorylation. O-GlcNAc proteins are commonly identified by tandem mass spectrometry following some form of biochemical enrichment. In the present study, we assessed whether, and to what extent, O-GlcNAc-modified proteins can be discovered from existing large-scale proteome data sets. To this end, we conceived a straightforward O-GlcNAc identification strategy based on our recently developed Oscore software, which automatically analyzes tandem mass spectra for the presence and intensity of O-GlcNAc diagnostic fragment ions. Using the Oscore, we discovered hundreds of O-GlcNAc peptides that were not initially identified in these studies, most of which have not been described before. Merely re-searching these data extended the number of known O-GlcNAc proteins by almost 100, suggesting that this modification is even more widespread than previously anticipated and is often sufficiently abundant to be detected without enrichment. However, a comparison of O-GlcNAc and phospho-identifications from the very same data indicates that the O-GlcNAc modification is considerably less abundant than phosphorylation. The discovery of numerous doubly modified peptides (i.e., peptides with one or multiple O-GlcNAc and phosphate moieties) suggests that O-GlcNAc and phosphorylation are not necessarily mutually exclusive, but can occur simultaneously at adjacent sites.
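
The abstract describes screening tandem mass spectra for the presence and intensity of O-GlcNAc diagnostic fragment ions. The following is a minimal sketch of that general idea, not the published Oscore algorithm, whose formula is not given here. The function name `diagnostic_ion_summary`, the 20 ppm tolerance, and the "fraction of total intensity" summary are assumptions made for illustration. The m/z values are the commonly cited HexNAc oxonium ion (204.0867) and its usual fragment ions.

```python
# Illustrative sketch only; this is NOT the published Oscore algorithm.
# Commonly cited HexNAc oxonium ion and its fragment ions (singly charged m/z).
DIAGNOSTIC_MZ = (204.0867, 186.0761, 168.0655, 144.0655, 138.0550, 126.0550)


def diagnostic_ion_summary(peaks, tol_ppm=20.0):
    """Summarize diagnostic-ion evidence in one MS/MS spectrum.

    peaks   -- list of (mz, intensity) tuples
    tol_ppm -- mass tolerance in ppm (hypothetical default)

    Returns (number of diagnostic ions matched,
             matched intensity as a fraction of total intensity).
    """
    total = sum(intensity for _, intensity in peaks)
    n_matched = 0
    matched_intensity = 0.0
    for target in DIAGNOSTIC_MZ:
        tol = target * tol_ppm * 1e-6  # ppm tolerance converted to m/z units
        hits = [i for mz, i in peaks if abs(mz - target) <= tol]
        if hits:
            n_matched += 1
            matched_intensity += max(hits)  # keep the strongest peak in the window
    fraction = matched_intensity / total if total > 0 else 0.0
    return n_matched, fraction


# Toy spectrum: two diagnostic ions plus one unrelated peak.
spectrum = [(204.0870, 500.0), (138.0551, 300.0), (500.3000, 200.0)]
print(diagnostic_ion_summary(spectrum))
```

Under these assumptions, spectra with several matched ions and a high intensity fraction would be candidates for a follow-up database search that allows O-GlcNAc as a variable modification. Any thresholds would need to be calibrated against the data at hand.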

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acetylglucosamine / metabolism*
  • Amino Acid Sequence
  • Cell Line
  • Cell Nucleus / metabolism
  • Cytoplasm / metabolism
  • Databases, Protein
  • Humans
  • Molecular Sequence Data
  • Peptides / analysis*
  • Phosphoproteins / analysis*
  • Phosphorylation
  • Protein Processing, Post-Translational*
  • Proteome / analysis
  • Proteome / metabolism*
  • Serine / metabolism
  • Software*
  • Tandem Mass Spectrometry
  • Threonine / metabolism


Substances

  • Peptides
  • Phosphoproteins
  • Proteome
  • Threonine
  • Serine
  • Acetylglucosamine