Mass spectrometry based glycoproteomics has become a major means of identifying and characterizing previously N-linked glycan attached loci (glycosites). In the bottom-up approach, several factors which include but not limited to sample preparation, mass spectrometry analyses, and protein sequence database searches result in previously N-linked peptide spectrum matches (PSMs) of varying lengths. Given that multiple PSM scan map to a glycosite, we reason that identified PSMs are varying length peptide species of a unique set of glycosites. Because associated spectra of these PSMs are typically summed separately, true glycosite associated spectra counts are lost or complicated. Also, these varying length peptide species complicate protein inference as smaller sized peptide sequences are more likely to map to more proteins than larger sized peptides or actual glycosite sequences. Here, we present XGlycScan. XGlycScan maps varying length peptide species to glycosites to facilitate an accurate quantification of glycosite associated spectra counts. We observed that this reduced the variability in reported identifications of mass spectrometry technical replicates of our sample dataset. We also observed that mapping identified peptides to glycosites provided an assessment of search-engine identification. Inherently, XGlycScan reported glycosites reduce the complexity in protein inference. We implemented XGlycScan in the platform independent Java programing language and have made it available as open source. XGlycScan's source code is freely available at https://bitbucket.org/paiyetan/xglycscan/src and its compiled binaries and documentation can be freely downloaded at https://bitbucket.org/paiyetan/xglycscan/downloads. The graphical user interface version can also be found at https://bitbucket.org/paiyetan/xglycscangui/src and https://bitbucket.org/paiyetan/xglycscangui/downloads respectively.
Keywords: Bioinformatics; Glycopeptide; Glycosite; Peptide; Protein identification; Proteomics; Quality assessment.