Analysis of the derived amino acid sequences of toxins A and B from Clostridium difficile has identified an extraordinarily large number of repeat amino acid units in the C-terminal regions of the proteins. Nearly one third of each protein consists of repeating units, which appear, at least in the case of toxin A, to be responsible for carbohydrate binding. Similar repeat units are also found in the C-terminal regions of four glucosyltransferases from Streptococcus mutans and Streptococcus downei, and in four lytic enzymes from Streptococcus pneumoniae and its bacteriophages (HB-3, Cp-1 and Cp-9). In each case the repeats constitute the ligand-binding portion of the respective enzyme. A glucan-binding protein from S. mutans, which lacks enzymatic activity, has similar repeats spanning almost the entire molecule. This family of ligand-binding proteins appears to be of modular design, with one module consisting of a repetitive ligand-binding domain located in the C-terminal region and the other module(s) providing enzymatic functions.