GS-align for glycan structure alignment and similarity measurement

Bioinformatics. 2015 Aug 15;31(16):2653-9. doi: 10.1093/bioinformatics/btv202. Epub 2015 Apr 8.


Motivation: Glycans play critical roles in many biological processes, and their structural diversity is key for specific protein-glycan recognition. Comparative structural studies of biological molecules provide useful insight into their biological relationships. However, most computational tools are designed for protein structures, and despite the importance of glycans, no tool is currently available for comparing glycan structures in a sequence order- and size-independent manner.

Results: A novel method, GS-align, is developed for glycan structure alignment and similarity measurement. GS-align generates possible alignments between two glycan structures through iterative maximum clique search and fragment superposition. The optimal alignment is then determined by the maximum structural similarity score, GS-score, which is size-independent. Benchmark tests against the Protein Data Bank (PDB) N-linked glycan library and PDB homologous/non-homologous N-glycoprotein sets indicate that GS-align is a robust computational tool to align glycan structures and quantify their structural similarity. GS-align is also applied to template-based glycan structure prediction and monosaccharide substitution matrix generation to illustrate its utility.
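To make the clique-search idea concrete, the following is a minimal sketch of one common way to align two point sets by maximum clique search. It is not the GS-align implementation. It assumes each glycan is reduced to a list of 3D points (for example, one per residue), and the function names, coordinates, and distance tolerance are hypothetical. Candidate pairings (i, j) become graph nodes, two pairings are linked when the corresponding internal distances agree within the tolerance, and the largest clique is the largest mutually consistent set of pairings. The fragment superposition step and the GS-score are not shown.

```python
from itertools import combinations
from math import dist


def correspondence_graph(a, b, tol=0.5):
    """Build a graph whose nodes are index pairs (i, j), one index from each
    structure. Two nodes are adjacent when they use different points in both
    structures and the internal distances agree within tol."""
    nodes = [(i, j) for i in range(len(a)) for j in range(len(b))]
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i != k and j != l and abs(dist(a[i], a[k]) - dist(b[j], b[l])) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_clique(adj):
    """Bron-Kerbosch with pivoting; returns one maximum clique, sorted."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest seen so far.
            if len(r) > len(best):
                best = list(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)


# Hypothetical coordinates: b holds points 1..3 of a, translated by
# (10, 5, 0) and listed in reverse order.
a = [(0, 0, 0), (1, 0, 0), (1, 2, 0), (4, 2, 1)]
b = [(14, 7, 1), (11, 7, 0), (11, 5, 0)]

alignment = max_clique(correspondence_graph(a, b, tol=0.1))
print(alignment)  # [(1, 2), (2, 1), (3, 0)]
```

Because the graph is built from internal distances only, the recovered pairing does not depend on the order in which points are listed or on the relative size of the two structures. It is also blind to mirror images, which is one reason a superposition step is needed after the clique search.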

Availability and implementation:


Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Carbohydrate Conformation
  • Carbohydrate Sequence
  • Computational Biology / methods*
  • Glycoproteins / chemistry*
  • Humans
  • Molecular Sequence Data
  • Polysaccharides / chemistry*
  • Sequence Alignment / methods*
  • Software*


Substances

  • Glycoproteins
  • Polysaccharides