Common recognition principles across diverse sequence and structural families of sialic acid binding proteins

Glycobiology. 2014 Jan;24(1):5-16. doi: 10.1093/glycob/cwt063. Epub 2013 Sep 16.


Sialic acids form a large family of 9-carbon monosaccharides and are integral components of glycoconjugates. They are known to bind to a wide range of receptors belonging to diverse sequence families and fold classes and are key mediators in a plethora of cellular processes. Thus, it is of great interest to understand the features that give rise to such a recognition capability. Structural analyses were carried out on a non-redundant data set of known sialic acid binding proteins. These included exhaustive binding site comparisons and site alignments using in-house algorithms, followed by clustering and tree computation, and led to the derivation of sialic acid recognition principles. Although the proteins in the data set belong to several sequence and structure families, their binding sites could be grouped into only six types. Structural comparison of the binding sites indicates that all sites contain one or more different combinations of key structural features over a common scaffold. The six binding site types thus serve as structural motifs for recognizing sialic acid. Scanning the motifs against a non-redundant set of binding sites from the PDB indicated that the motifs are specific for sialic acid recognition. Knowledge of the determinants obtained from this study will be useful for detecting function in uncharacterized proteins. As an example analysis, a genome-wide scan for the motifs in structures from the Mycobacterium tuberculosis proteome identified 17 hits that contain combinations of the features, suggesting that these proteins may bind sialic acid.
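The grouping step described in the abstract (pairwise binding site comparison followed by clustering) can be illustrated with a minimal sketch. This is not the authors' in-house algorithm, which the abstract does not describe. The site names, the distance values, the cutoff and the function name `cluster_sites` are all hypothetical, and single-linkage clustering is used here only as a simple stand-in.

```python
from itertools import combinations


def cluster_sites(names, dist, cutoff):
    """Single-linkage grouping: sites closer than `cutoff` share a cluster.

    `dist` maps frozenset({a, b}) to a pairwise distance. In a real study
    this would come from a binding site alignment score; here it is toy data.
    """
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] <= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical sites and made-up distances, for illustration only.
names = ["siteA", "siteB", "siteC", "siteD"]
dist = {
    frozenset(("siteA", "siteB")): 0.1,
    frozenset(("siteA", "siteC")): 0.8,
    frozenset(("siteA", "siteD")): 0.9,
    frozenset(("siteB", "siteC")): 0.7,
    frozenset(("siteB", "siteD")): 0.9,
    frozenset(("siteC", "siteD")): 0.2,
}
clusters = cluster_sites(names, dist, cutoff=0.3)
print(clusters)
```

With these toy values, siteA and siteB fall into one group and siteC and siteD into another. The choice of cutoff decides how many site types emerge, so a real analysis would need to justify it against the data.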

Keywords: binding site signature; ligand binding; sialic acid recognition; structural bioinformatics; structural motifs.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Motifs
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics*
  • Bacterial Proteins / metabolism
  • Binding Sites
  • Lectins / chemistry
  • Lectins / genetics*
  • Lectins / metabolism
  • Mycobacterium tuberculosis / chemistry
  • Mycobacterium tuberculosis / genetics*
  • Mycobacterium tuberculosis / metabolism
  • N-Acetylneuraminic Acid / chemistry
  • N-Acetylneuraminic Acid / genetics*
  • N-Acetylneuraminic Acid / metabolism
  • Proteome / chemistry
  • Proteome / genetics*
  • Proteome / metabolism
  • Sequence Analysis, Protein

Substances

  • Bacterial Proteins
  • Lectins
  • Proteome
  • N-Acetylneuraminic Acid