J Mol Biol. 2019 Mar 15;431(6):1217-1233.
doi: 10.1016/j.jmb.2019.01.024. Epub 2019 Jan 25.

Extent and Origins of Functional Diversity in a Subfamily of Glycoside Hydrolases

Free PMC article

Evan M Glasgow et al. J Mol Biol.

Abstract

Some glycoside hydrolases have broad specificity for hydrolysis of glycosidic bonds, potentially increasing their functional utility and flexibility in physiological and industrial applications. To deepen the understanding of the structural and evolutionary driving forces underlying specificity patterns in glycoside hydrolase family 5, we quantitatively screened the activity of the catalytic core domains from subfamily 4 (GH5_4) and closely related enzymes on four substrates: lichenan, xylan, mannan, and xyloglucan. Phylogenetic analysis revealed that GH5_4 consists of three major clades, and one of these clades, referred to here as clade 3, displayed average specific activities of 4.2 and 1.2 U/mg on lichenan and xylan, approximately 1 order of magnitude larger than the average for active enzymes in clades 1 and 2. Enzymes in clade 3 also more consistently met assay detection thresholds for reaction with all four substrates. We also identified a subfamily-wide positive correlation between lichenase and xylanase activities, as well as a weaker relationship between lichenase and xyloglucanase. To connect these results to structural features, we used the structure of CelE from Hungateiclostridium thermocellum (PDB 4IM4) as an example clade 3 enzyme with activities on all four substrates. Comparison of the sequence and structure of this enzyme with others throughout GH5_4 and neighboring subfamilies reveals at least two residues (H149 and W203) that are linked to strong activity across the substrates. Placing GH5_4 in context with other related subfamilies, we highlight several possibilities for the ongoing evolutionary specialization of GH5_4 enzymes.
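The correlation between lichenase and xylanase activities described above can be illustrated with a minimal sketch: fit a line to paired log10 specific activities and estimate a 90% confidence interval for the slope by resampling enzymes. The activity values below are invented for illustration only and are not data from the study; the function names and the percentile-bootstrap procedure are assumptions, not necessarily the authors' method.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares slope, intercept and Pearson r for paired values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

def bootstrap_slope_ci(xs, ys, n_boot=2000, level=0.90, seed=0):
    """Percentile confidence interval for the slope, resampling enzyme pairs."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue  # skip degenerate resamples with no variance
        slopes.append(fit_line(bx, by)[0])
    slopes.sort()
    tail = (1.0 - level) / 2.0
    return slopes[int(tail * n_boot)], slopes[int((1.0 - tail) * n_boot) - 1]

# Hypothetical specific activities in U/mg (illustrative values, NOT study data).
lichenase = [0.05, 0.12, 0.30, 0.45, 0.90, 1.6, 3.1, 4.2, 6.0, 9.5]
xylanase = [0.02, 0.03, 0.10, 0.08, 0.20, 0.55, 0.70, 1.2, 1.1, 3.0]

log_lich = [math.log10(v) for v in lichenase]
log_xyl = [math.log10(v) for v in xylanase]

slope, intercept, r = fit_line(log_lich, log_xyl)
lo, hi = bootstrap_slope_ci(log_lich, log_xyl)
print(f"slope={slope:.2f}, r={r:.2f}, 90% CI for slope=({lo:.2f}, {hi:.2f})")
```

Working in log units keeps enzymes whose activities span several orders of magnitude from being dominated by the few most active ones.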

Keywords: glycoside hydrolase; polysaccharide; protein evolution; substrate specificity; synthetic biology.

Conflict of interest statement

Conflict of interest: The authors declare no competing financial interests.

Figures

Figure 1.
Activity data characteristics. Blue, red, and green represent enzyme performance on lichenan, xylan, and mannan, respectively. (A) Venn diagram showing assay-defined mono-, bi-, and trifunctional phenotypes. (B) Histograms of optimal temperature (top panel) and pH (lower panel) among enzymes with detected activities.
Figure 2.
Histogram of log10 specific activities for GH5_4 enzymes on lichenan (top panel), xylan (middle) and mannan (lower); bin size = 0.36 log unit. For each panel, solid bars describe raw specific activity values, and open-hatched bars describe the histogram for specific activities corrected for substrate depletion. Measurements that likely indicated lower limits were counted in the top bin for each substrate (3 such examples for lichenan and 4 for xylan).
Figure 3.
Correlation in enzyme activities on lichenan and xylan. (Main Figure) Yield-normalized log specific activities (see Methods) for each enzyme on lichenan are plotted against corresponding activity on xylan, for all enzymes possessing both activities. Error bars reflect the impact of ± 5% uncertainty in the measured yield on the plotted specific activity. Open symbols indicate that the measurement reflects a lower bound, and a dashed arrow indicates that an upper bound cannot be determined. The solid line and surrounding shaded region illustrate the best-fit line and the corresponding resampling-determined 90% confidence interval. (Inset) Corresponding plot of optimal temperatures for lichenase and xylanase activities. Transparency and jitter are added to each point to aid in visualization of tallies for each bin.
Figure 4.
Consensus phylogram constructed using catalytic core sequences for GH5 subfamilies 4, 25, 36, 37, 38, 39 and 52. Nodes with <50% support are displayed as polytomies. Major clades within GH5_4 are denoted by yellow, brown and orange highlighting, with varying saturation for each corresponding to subclade partitioning; other subfamilies are denoted by varying shades of gray. Enzyme sequences that were tested for activities are indicated by a three-cell table at the outer edge of the tree. The presence of blue, red, or green circles within each cell indicates lichenase, xylanase, or mannanase activity, respectively, and circle size indicates activity strength. The largest, most intense symbols reflect enzymes with specific activities ranking in the top third. The location of CelE is also highlighted (*).
Figure 5.
Phylogenetic grouping-focused plots of base-10 logarithm lichenase versus xylanase activities. Symbol colors and types reflect clade membership as indicated, with ‘out’ indicating outgroup enzymes from related subfamilies; green outlines (M) highlight enzymes for which mannanase activity was also detected. Error bars report uncertainty in plotted specific activities caused by propagation of uncertainty in yield measurements, as in Figure 3. For clarity, only uncertainties of this type greater than 0.1 log units are plotted. Specific activity measurements that are likely lower bounds in either lichenase or xylanase values are denoted by symbols with dashed borders. Arrowheads on error bars similarly indicate that an upper limit uncertainty value cannot be determined. Scatter points in the (left-hand) off-scale shaded region exhibited detectable lichenase, but not xylanase activity. Interactive plot features allow inspection of accession code, source organism, measured specific activities and optimum conditions for each enzyme. Specific activities are in U/mg (n.d. = not detectable); optimum temperatures corresponding to reported specific activity measurements are in °C for lichenase, xylanase and mannanase, respectively. Xyloglucanase specific activity measurements were obtained in a separate set of experiments, with all data collected at 30 °C.
Figure 6.
Structure of CelE from Hungateiclostridium thermocellum (PDB ID 4IM4). Residues previously implicated in substrate binding are highlighted according to extent of conservation throughout GH5. Red, catalytic glutamates (E193 and E316); orange, GH5-wide conserved residues (H148, N192, Y270); green, GH5_4 conserved residues (N72 and W82); slate, variable residue positions (H149, W203, Y273, N351, E360). (A) Surface representation of CelE showing the oligosaccharide binding channel. (B) Expanded view of the binding channel. Oligosaccharide subsite locations were identified through homology-alignment with other GH5_4 crystal structures; in particular, 5D9N was used to identify subsites +1 and +2, and an unpublished cellotriose-bound structure (CBL16772.1) was used for subsites −3, −2, and −1.
Figure 7.
Distribution and relationship of specific activities and identities of variable binding cleft residues. (A) Cumulative distribution plots for all tested enzymes; the plot contours describe the probability that an enzyme selected at random would exceed the value provided on the abscissa. Left of breakpoint, probability that an activity would be detected in our assays; right of breakpoint, probability that the activity would exceed a specific value. Blue, red, and green graphs represent measurements on lichenan, xylan and mannan. (B) Schematic of binding channel composition and activity data for each clade. The higher-order phylogenetic tree branches are presented in topological format for clarity (relative tree branch lengths within each subclade are meaningful). Extension of a leaf node through the three bar plots reveals the alignment-determined identities of key binding channel residues corresponding to positions 149, 203 and 273 in CelE. The PDB column notes the location of enzymes with deposited crystallographic data. Cumulative specific activity distribution functions are plotted on the right-hand side of each row; for reference, the contour for all tested enzymes is included as a dashed line in each subplot.
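The cumulative distribution described for Figure 7A can be sketched as an exceedance probability over all tested enzymes, with undetected activities counted in the denominator. This is a minimal illustration under stated assumptions: the function names are hypothetical, `None` is an assumed encoding for "not detectable", and the sample values are invented and are not data from the study.

```python
def detection_probability(activities):
    """Fraction of tested enzymes with any detected activity (None = not detectable)."""
    return sum(a is not None for a in activities) / len(activities)

def exceedance_probability(activities, threshold):
    """Fraction of ALL tested enzymes whose detected activity is at least `threshold`."""
    detected = [a for a in activities if a is not None]
    return sum(a >= threshold for a in detected) / len(activities)

# Hypothetical specific activities in U/mg (illustrative values, NOT study data).
sample = [None, None, 0.05, 0.2, 0.4, 1.0, 2.5, 4.0]

print(detection_probability(sample))        # plateau left of the breakpoint
print(exceedance_probability(sample, 1.0))  # a point right of the breakpoint
```

Dividing by the number of tested enzymes rather than the number of active ones is what makes the curve level off at the detection probability instead of at 1.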
