Evolutionary bases of carbohydrate recognition and substrate discrimination in the ROK protein family

J Mol Evol. 2010 Jun;70(6):545-56. doi: 10.1007/s00239-010-9351-1. Epub 2010 May 30.


The ROK (repressor, open reading frame, kinase) protein family (Pfam 00480) is a large collection of bacterial polypeptides that includes sugar kinases, carbohydrate-responsive transcriptional repressors, and many functionally uncharacterized gene products. ROK family sugar kinases phosphorylate a range of structurally distinct hexoses, including the key carbon source D-glucose, various glucose epimers, and several acetylated hexosamines. The primary sequence elements responsible for carbohydrate recognition within different functional categories of ROK polypeptides are largely unknown because of the limited structural characterization of this protein family. To identify the structural bases for substrate discrimination in individual ROK proteins, and to better understand the evolutionary processes that led to the divergence of function in this family, we constructed an inclusive alignment of 227 representative ROK polypeptides. Phylogenetic analyses and ancestral sequence reconstructions based on the resulting tree reveal a discrete collection of active site residues that dictate substrate specificity. The results also suggest a series of mutational events within the carbohydrate-binding sites of ROK proteins that facilitated the expansion of substrate specificity within this family. This study provides new insight into the evolutionary relationship between ROK glucokinases and non-ROK glucokinases (Pfam 02685), revealing the primary sequence elements shared between these two protein families, which diverged from a common ancestor in ancient times.
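
As a rough illustration of how candidate specificity-determining residues might be picked out of a multiple sequence alignment, the sketch below flags columns that are fully conserved within each functional group but differ between groups. It is a simplified column comparison, not the phylogenetic and ancestral reconstruction approach described in the abstract, and the sequences, group labels, and function names are hypothetical placeholders invented for the example.

```python
from collections import Counter

# Hypothetical toy alignment: these sequences and group labels are
# invented for illustration and are not taken from the study.
ALIGNMENT = {
    "glucose_kinase": ["GDGTEHA", "GDGSEHA", "GDGTEHA"],
    "hexosamine_kinase": ["GDGTNHC", "GDGSNHC", "GDGANHC"],
}


def consensus(column):
    """Return (most common residue, its frequency) for one alignment column."""
    residue, count = Counter(column).most_common(1)[0]
    return residue, count / len(column)


def discriminating_columns(alignment, min_freq=1.0):
    """Find columns conserved within every group but differing between groups."""
    groups = list(alignment.values())
    length = len(groups[0][0])  # assumes all sequences are aligned to equal length
    hits = []
    for i in range(length):
        per_group = [consensus([seq[i] for seq in seqs]) for seqs in groups]
        conserved = all(freq >= min_freq for _, freq in per_group)
        residues = {res for res, _ in per_group}
        if conserved and len(residues) > 1:
            hits.append((i, [res for res, _ in per_group]))
    return hits


RESULT = discriminating_columns(ALIGNMENT)
print(RESULT)
```

In this toy data, the zero-based columns 4 and 6 are flagged because each group is invariant there while the two groups carry different residues. Lowering `min_freq` would tolerate some within-group variation, at the cost of more false positives.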

MeSH terms

  • Amino Acid Sequence
  • Binding Sites / genetics
  • Binding Sites / physiology
  • Computational Biology
  • Databases, Protein
  • Evolution, Molecular*
  • Molecular Sequence Data
  • Phylogeny
  • Proteins / chemistry
  • Proteins / genetics
  • Proteins / metabolism*
  • Sequence Alignment
  • Sequence Homology, Amino Acid
  • Substrate Specificity / genetics
  • Substrate Specificity / physiology


  • Proteins