Homology between O-linked GlcNAc transferases and proteins of the glycogen phosphorylase superfamily

J Mol Biol. 2001 Nov 30;314(3):365-74. doi: 10.1006/jmbi.2001.5151.


The O-linked GlcNAc transferases (OGTs) are a recently characterized group of largely eukaryotic enzymes that add a single beta-N-acetylglucosamine moiety to specific serine or threonine hydroxyls. In humans, this process may be part of a sugar regulation mechanism or cellular signaling pathway that is involved in many important diseases, such as diabetes, cancer, and neurodegeneration. However, no structural information about the human OGT exists, except for the identification of tetratricopeptide repeats (TPR) at the N terminus. The locations of substrate binding sites are unknown and the structural basis for this enzyme's function is not clear. Here, remote homology is reported between the OGTs and a large group of diverse sugar processing enzymes, including proteins with known structure such as glycogen phosphorylase, UDP-GlcNAc 2-epimerase, and the glycosyl transferase MurG. This relationship, in conjunction with amino acid similarity spanning the entire length of the sequence, implies that the fold of the human OGT consists of two Rossmann-like domains C-terminal to the TPR region. A conserved motif in the second Rossmann domain points to the UDP-GlcNAc donor binding site. This conclusion is supported by a combination of statistically significant PSI-BLAST hits, consensus secondary structure predictions, and a fold recognition hit to MurG. Additionally, iterative PSI-BLAST database searches reveal that proteins homologous to the OGTs form a large and diverse superfamily that is termed GPGTF (glycogen phosphorylase/glycosyl transferase). Up to one-third of the 51 functional families in the CAZY database, a glycosyl transferase classification scheme based on catalytic residue and sequence homology considerations, can be unified through this common predicted fold. GPGTF homologs constitute a substantial fraction of known proteins: 0.4% of all non-redundant sequences and about 1% of proteins in the Escherichia coli genome are found to belong to the GPGTF superfamily.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Bacterial Outer Membrane Proteins*
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / metabolism
  • Binding Sites
  • Carbohydrate Epimerases / chemistry
  • Carbohydrate Epimerases / metabolism
  • Computational Biology
  • Conserved Sequence
  • Databases, Protein
  • Escherichia coli Proteins*
  • Glycogen Phosphorylase / chemistry*
  • Glycogen Phosphorylase / metabolism
  • Humans
  • Models, Molecular
  • Molecular Sequence Data
  • Multigene Family
  • N-Acetylglucosaminyltransferases / chemistry*
  • N-Acetylglucosaminyltransferases / metabolism
  • Protein Conformation
  • Protein Folding
  • Saccharomyces cerevisiae Proteins / chemistry
  • Saccharomyces cerevisiae Proteins / metabolism
  • Sequence Alignment
  • Sequence Homology, Amino Acid*

Substances

  • Bacterial Outer Membrane Proteins
  • Bacterial Proteins
  • Escherichia coli Proteins
  • Saccharomyces cerevisiae Proteins
  • Glycogen Phosphorylase
  • N-Acetylglucosaminyltransferases
  • UDP-N-acetylglucosamine-N-acetylmuramyl-(pentapeptide)pyrophosphoryl-undecaprenol N-acetylglucosamine transferase
  • Carbohydrate Epimerases
  • UDP acetylglucosamine-2-epimerase
  • wecB protein, E coli