Consensus engineering, the replacement of amino acids with the residue that occurs most frequently at each position of a multiple sequence alignment (MSA), is an established strategy for increasing the stability of a protein. Applying this concept to the entire sequence of an enzyme, however, has been attempted only a few times, mainly because the consensus is difficult to determine in highly variable regions. We show that this problem can be solved by replacing such regions with the corresponding sequence of the natural homologue closest to the consensus. When one or a few sub-families are overrepresented in the MSA, the consensus sequence becomes a biased representation of the sequence space. We examine the influence of this bias by constructing three consensus sequences from different MSAs of sucrose phosphorylase (SP). Each consensus enzyme contained about 70 mutations relative to its closest natural homologue, folded correctly, and displayed activity on sucrose. Correlation analysis revealed that the family's co-evolution network was kept intact, which is one of the main advantages of full-length consensus design. The consensus enzymes displayed an "average" thermostability, that is, one that is higher than that of some but not all known representatives. We cautiously present practical rules for the design of consensus sequences, but warn that the measure of success depends on which natural enzyme is used as the point of comparison.
Keywords: consensus design; protein engineering; protein stability; sequence correlation; sucrose phosphorylase.
Copyright © 2013 Wiley Periodicals, Inc.
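
The consensus rule summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the `min_freq` cutoff of 0.5 for calling a column "highly variable", the alphabetical tie-breaking, and the per-column (rather than per-region) patching are all assumptions made for illustration. The input is assumed to be a list of equal-length, already aligned sequences that use `-` for gaps.

```python
from collections import Counter

GAP = "-"


def consensus(msa, min_freq=0.5):
    """Return (consensus string, per-column 'variable' flags).

    A column is flagged as variable when its most frequent residue occurs
    in fewer than `min_freq` of all sequences (hypothetical cutoff).
    """
    n = len(msa)
    residues, variable = [], []
    for column in zip(*msa):
        counts = Counter(c for c in column if c != GAP)
        if not counts:  # column contains only gaps
            residues.append(GAP)
            variable.append(True)
            continue
        # Ties are broken alphabetically so the result is deterministic.
        best = max(sorted(counts), key=counts.get)
        residues.append(best)
        variable.append(counts[best] / n < min_freq)
    return "".join(residues), variable


def identity(a, b):
    """Fraction of aligned positions at which two sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closest_homologue(msa, cons):
    """Natural sequence with the highest identity to the consensus."""
    return max(msa, key=lambda seq: identity(seq, cons))


def patched_consensus(msa, min_freq=0.5):
    """Consensus with variable columns taken from the closest homologue."""
    cons, variable = consensus(msa, min_freq)
    donor = closest_homologue(msa, cons)
    merged = [d if v else c for c, d, v in zip(cons, donor, variable)]
    return "".join(merged).replace(GAP, "")


if __name__ == "__main__":
    toy_msa = ["MKTGYL", "MKTCYL", "MRSAYL", "MKTDYL"]  # made-up sequences
    print(consensus(toy_msa)[0])
    print(patched_consensus(toy_msa))
```

In the toy alignment, the fourth column has no majority residue, so the sketch takes that position from the closest natural sequence instead of picking an arbitrary residue. Overrepresentation of a sub-family would shift the `Counter` totals in the same way the abstract describes, which is why the composition of the MSA matters.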