A major challenge in glycomics is the characterization of complex glycan structures that are essential for understanding their diverse roles in many biological processes. We present a novel efficient computational approach, named GlycoDeNovo, for accurate elucidation of the glycan topologies from their tandem mass spectra. Given a spectrum, GlycoDeNovo first builds an interpretation-graph specifying how to interpret each peak using preceding interpreted peaks. It then reconstructs the topologies of peaks that contribute to interpreting the precursor ion. We theoretically prove that GlycoDeNovo is highly efficient. A major innovative feature added to GlycoDeNovo is a data-driven IonClassifier which can be used to effectively rank candidate topologies. IonClassifier is automatically learned from experimental spectra of known glycans to distinguish B- and C-type ions from all other ion types. Our results showed that GlycoDeNovo is robust and accurate for topology reconstruction of glycans from their tandem mass spectra. Graphical Abstract ᅟ.
Keywords: De novo glycan sequencing; Electronic excitation dissociation; Fourier-transform ion cyclotron resonance mass spectrometry; Machine learning.