Provenance of SET-domain histone methyltransferases through duplication of a simple structural unit

Cell Cycle. Jul-Aug 2003;2(4):369-76.


SET domains are protein lysine methyltransferases that methylate diverse proteins, such as histones, Rubisco and cytochrome c. In particular, they play an important role in the dynamics of eukaryotic chromatin and are present in several chromatin-associated proteins. Recently, structures of several SET domains have been solved, and they contain a conserved fold that is unrelated to previously characterized methyltransferases, which possess either Rossmann-fold or SPOUT domains. Phylogenetic and phyletic-profile analysis of the SET domain suggests that it was an evolutionary "invention" of the eukaryotic lineage, with secondary lateral transfers to bacteria. We show that the conserved N- and C-terminal regions, which comprise the core barrel-like module of the SET domain, are symmetric repeats of a simple 3-stranded unit. Furthermore, the two symmetrically arranged repeats contribute to the binding sites for the two substrates of the SET domain. This suggests that the SET domain arose from an ancestral dimer of this 3-stranded unit, with each unit probably functioning as a generic ligand-binding structure. The divergence between the two repeat units appears to have arisen as a result of their interactions with the central module of the SET domain, which was inserted between the two repeats. One of the repeats appears to have acquired adaptations that helped it specialize in AdoMet binding, whereas the second repeat contributed to histone interaction and to orienting a crucial active-site residue. The central module of the SET domain supplies a critical asparagine to the active site, and its structural features suggest that it may also have arisen from a further duplication of one of the repeats comprising the core barrel. However, it appears to have structurally diverged from the two canonical repeats due to the lack of an obligate dimerization partner.
The spatial position of the two repeats in the ancestral dimer appears to have favored the formation of the structural knot typical of the SET domain. A comparable knot is seen in the SPOUT-domain methyltransferases, and this represents a case of convergent evolution of an active-site-associated configuration in two otherwise unrelated classes of methylases. Thus, the SET domain provides a model for the innovation of a complex enzymatic fold through the duplications of a structurally simple non-enzymatic unit.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Bacteria / enzymology
  • Bacteria / genetics
  • Binding Sites / physiology
  • Caenorhabditis elegans / enzymology
  • Caenorhabditis elegans / genetics
  • Crystallography, X-Ray
  • Histone Methyltransferases
  • Histone-Lysine N-Methyltransferase / metabolism*
  • Methylation
  • Models, Molecular
  • Molecular Sequence Data
  • Phylogeny
  • Protein Conformation
  • Protein Methyltransferases
  • Protein Structure, Tertiary / physiology*
  • Sequence Alignment
  • Sequence Homology, Amino Acid

Substances


  • Histone Methyltransferases
  • Protein Methyltransferases
  • Histone-Lysine N-Methyltransferase