Validation of the curation pipeline of UniCarb-DB: building a global glycan reference MS/MS repository

Biochim Biophys Acta. 2014 Jan;1844(1 Pt A):108-16. doi: 10.1016/j.bbapap.2013.04.018. Epub 2013 Apr 25.


The UniCarb-DB database is an emerging public glycomics data repository, containing over 500 tandem mass spectra (as of March 2013) of glycans released from glycoproteins. A major challenge in glycomics research is to provide and maintain high-quality datasets that offer the diversity needed to support the development of accurate bioinformatics tools for data deposition and analysis. The role of UniCarb-DB, as an archival database, is to provide the glycomics community with open access to a comprehensive LC-MS/MS library of N- and O-linked glycans released from glycoproteins, annotated with glycosidic and cross-ring fragment ions, retention times, and associated experimental metadata descriptions. Here, we introduce the UniCarb-DB data submission pipeline and its practical application to construct a library of LC-MS/MS glycan standards that forms part of this database. In this context, an independent consortium of three laboratories was established to analyze the same 23 commercially available oligosaccharide standards, all using graphitized carbon liquid chromatography (LC) electrospray ionization (ESI) ion trap mass spectrometry in the negative ion mode. A dot product score was calculated for each spectrum in the three sets of data as a measure of the comparability that is necessary for use of such a collection in library-based spectral matching and glycan structural identification. The effects of charge state, de-isotoping and threshold levels on the quality of the input data are shown. The provision of well-characterized oligosaccharide fragmentation data provides the opportunity to identify determinants of specific glycan structures, and will contribute to the confidence level of algorithms that assign glycan structures to experimental MS/MS spectra. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
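The abstract does not give the exact form of the dot product score, so the following is only a minimal sketch of one common variant: a normalized dot product (cosine similarity) between two MS/MS spectra after unit-width m/z binning, with an optional relative intensity threshold. The function names, the bin width, the threshold parameter, and the example peak lists are all illustrative assumptions, not the authors' implementation or data.

```python
import math
from collections import defaultdict


def apply_threshold(peaks, relative_threshold=0.0):
    """Keep peaks whose intensity is at least a fraction of the base peak.

    `peaks` is a list of (m/z, intensity) tuples. The relative threshold
    is an assumed parameter for illustration.
    """
    if not peaks:
        return []
    base_peak = max(intensity for _, intensity in peaks)
    cutoff = relative_threshold * base_peak
    return [(mz, intensity) for mz, intensity in peaks if intensity >= cutoff]


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of peaks that fall into the same m/z bin."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def dot_product_score(peaks_a, peaks_b, bin_width=1.0):
    """Normalized dot product of two binned spectra, in the range 0 to 1."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum cannot match anything
    shared_bins = set(a) & set(b)
    numerator = sum(a[k] * b[k] for k in shared_bins)
    return numerator / (norm_a * norm_b)


# Hypothetical peak lists, made up purely to exercise the functions.
spec_a = [(179.1, 50.0), (221.1, 100.0), (382.2, 30.0)]
spec_b = [(505.3, 10.0)]
```

Under these assumptions, a score near 1 indicates that two laboratories recorded closely matching fragment intensities for the same standard, while a score of 0 means the spectra share no occupied bins. Comparing scores before and after `apply_threshold` is one way to probe how threshold levels affect comparability.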

Keywords: Database; Glycan; Glycobiology; Glycomics; Mass spectrometry; Standards.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Chromatography, High Pressure Liquid
  • Polysaccharides / chemistry*
  • Tandem Mass Spectrometry / methods*


Substances

  • Polysaccharides