Gene sequence and properties of CelI, a family E endoglucanase from Clostridium thermocellum

J Gen Microbiol. 1993 Feb;139(2):307-16. doi: 10.1099/00221287-139-2-307.


The Clostridium thermocellum celI gene, coding for endoglucanase I (CelI), consists of an open reading frame (ORF) of 2640 nucleotides and codes for a protein of M(r) 98531. The ORF was confirmed as celI by comparing the N-terminal sequence of purified recombinant CelI with that deduced from the nucleotide sequence. CelI hydrolysed lichenan and carboxymethylcellulose, but was principally active against barley beta-glucan. It exhibited significant sequence identity with subfamily E2 endoglucanases, and by analogy with others in this group contains a catalytic domain of around 500 residues located in the N-terminal half of the protein. The C-terminal region of CelI was highly homologous with the cellulose-binding domain of the non-catalytic cellulosome subunit, S1. A repeated segment, previously shown to be highly conserved in xylanase Z and in other endoglucanases from C. thermocellum, was absent from CelI. Antiserum raised against purified recombinant CelI cross-reacted with proteins contained in the cellulosomes of two strains of C. thermocellum, suggesting that CelI is either a component of the cellulosome or is homologous to other cellulosome proteins. A second gene, located upstream of celI, consisted of an ORF of 1671 nucleotides, coding for a protein of M(r) 61042. Based on its homology with the Escherichia coli tar gene product, the polypeptide encoded by the second gene is tentatively identified as a sensory transducer.
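The ORF lengths and M(r) values quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: the function name `orf_summary` is hypothetical, and it assumes that each quoted ORF length includes the stop codon, which the abstract does not state. Under that assumption, the mean mass per residue should fall near the roughly 110 Da typical of proteins.

```python
# ORF lengths (nucleotides) and deduced M(r) values quoted in the abstract.
ORFS = {
    "celI": (2640, 98531),
    "upstream ORF": (1671, 61042),
}


def orf_summary(nucleotides, m_r, includes_stop=True):
    """Return (residue count, mean mass per residue) for an ORF.

    includes_stop is an assumption: the abstract does not say whether
    the stop codon is counted in the quoted ORF length.
    """
    if nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = nucleotides // 3
    # Drop the stop codon, if counted, to get the number of residues.
    residues = codons - 1 if includes_stop else codons
    return residues, m_r / residues


for name, (nt, m_r) in ORFS.items():
    residues, mean_mass = orf_summary(nt, m_r)
    print(f"{name}: {residues} residues, ~{mean_mass:.1f} Da per residue")
```

Both ORF lengths are multiples of three, and both mean residue masses come out close to 110 Da, so the quoted lengths and M(r) values are mutually consistent under the stated assumption.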

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Base Sequence
  • Cellulose 1,4-beta-Cellobiosidase
  • Cloning, Molecular
  • Clostridium / enzymology
  • Clostridium / genetics*
  • DNA, Bacterial / genetics
  • Genes, Bacterial
  • Glucans / metabolism
  • Glycoside Hydrolases / chemistry
  • Glycoside Hydrolases / genetics*
  • Glycoside Hydrolases / metabolism
  • Molecular Sequence Data
  • Molecular Weight
  • Open Reading Frames
  • Sequence Homology, Amino Acid
  • Substrate Specificity


Substances

  • DNA, Bacterial
  • Glucans
  • Glycoside Hydrolases
  • Cellulose 1,4-beta-Cellobiosidase

Associated data

  • GENBANK/L04735
  • GENBANK/L04736