Co-Evolutionary Fitness Landscapes for Sequence Design

Angew Chem Int Ed Engl. 2018 May 14;57(20):5674-5678. doi: 10.1002/anie.201713220. Epub 2018 Mar 25.

Abstract

Efficient and accurate models for predicting the fitness of a sequence would be extremely valuable in protein design. We have explored the use of statistical potentials describing the coevolutionary fitness landscape, extracted from known protein sequences, in conjunction with Monte Carlo simulations, as a tool for design. As proof of principle, we created a series of predicted high-fitness sequences for three protein folds representative of different structural classes: the GA (all-α) and GB (α/β) binding domains of streptococcal protein G, and an SH3 (all-β) domain. We found that most of the designed proteins fold stably into the target structure, and we determined a structure for one representative design of each of the GA, GB and SH3 folds. Several of our designed proteins were also able to bind their native ligands, in some cases with higher affinity than the wild-type protein. Thus, a search using a statistical fitness landscape is a remarkably effective tool for finding novel stable protein sequences.
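
The search strategy described above can be illustrated with a minimal sketch. The abstract does not specify the functional form of the statistical potential or the Monte Carlo move set, so everything below is an assumption made for illustration: a Potts-like energy with per-site fields h and pairwise couplings J, single-site mutation moves, and Metropolis acceptance. The parameters are random toy values, not values inferred from a real sequence alignment, and the names random_potts, energy and metropolis are hypothetical.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def random_potts(length, rng):
    """Toy parameters (assumption): random fields h[i][a] and couplings
    J[(i, j)][a][b] for i < j. A real model would be fitted to an alignment."""
    q = len(ALPHABET)
    h = [[rng.gauss(0.0, 1.0) for _ in range(q)] for _ in range(length)]
    J = {
        (i, j): [[rng.gauss(0.0, 0.1) for _ in range(q)] for _ in range(q)]
        for i in range(length)
        for j in range(i + 1, length)
    }
    return h, J


def energy(seq, h, J):
    """Statistical energy of a sequence given as a list of residue indices.
    Lower energy is read here as higher predicted fitness."""
    e = -sum(h[i][a] for i, a in enumerate(seq))
    e -= sum(J[(i, j)][seq[i]][seq[j]] for (i, j) in J)
    return e


def metropolis(seq, h, J, steps, temperature, rng):
    """Metropolis Monte Carlo over sequence space with single-site mutations.
    Returns the lowest-energy sequence visited and its energy."""
    seq = list(seq)
    e = energy(seq, h, J)
    best, best_e = list(seq), e
    for _ in range(steps):
        i = rng.randrange(len(seq))
        old = seq[i]
        seq[i] = rng.randrange(len(ALPHABET))  # propose a point mutation
        e_new = energy(seq, h, J)
        delta = e_new - e
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            e = e_new
            if e < best_e:
                best, best_e = list(seq), e
        else:
            seq[i] = old  # reject: restore the previous residue
    return best, best_e


def decode(seq):
    """Convert residue indices back to a one-letter amino acid string."""
    return "".join(ALPHABET[a] for a in seq)


if __name__ == "__main__":
    rng = random.Random(0)
    h, J = random_potts(10, rng)
    start = [rng.randrange(len(ALPHABET)) for _ in range(10)]
    best, best_e = metropolis(start, h, J, steps=2000, temperature=1.0, rng=rng)
    print(decode(start), energy(start, h, J))
    print(decode(best), best_e)
```

In this sketch the full energy is recomputed at every step for clarity; a practical implementation would likely compute only the energy change caused by the mutated site. The temperature controls how readily uphill moves are accepted and would need tuning for any real landscape.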

Keywords: biophysics; coevolution; computations; protein design; statistical mechanics.

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Bacterial Proteins / chemical synthesis*
  • Bacterial Proteins / chemistry
  • Models, Molecular
  • Monte Carlo Method
  • Protein Conformation
  • Protein Folding

Substances

  • Bacterial Proteins
  • IgG Fc-binding protein, Streptococcus