Use of a quantitative structure-property relationship to design larger model proteins that fold rapidly

Protein Eng. 1999 Nov;12(11):909-17. doi: 10.1093/protein/12.11.909.


A quantitative structure-property relationship (QSPR) was used to design model protein sequences that fold repeatedly and relatively rapidly to stable target structures. The specific model was a 125-residue heteropolymer chain subject to Monte Carlo (MC) dynamics on a simple cubic lattice. The QSPR was derived from an analysis of a database of 200 sequences by a statistical method that uses a genetic algorithm to select the sequence attributes that are most important for folding and a neural network to determine the corresponding functional dependence of folding ability on the chosen attributes. The QSPR depends on the number of anti-parallel sheet contacts, the energy gap between the native state and the quasi-continuous part of the spectrum, and the total energy of the contacts between surface residues. Two MC procedures were used in series to optimize both the target structures and the sequences. We generated 20 fully optimized sequences and 60 partially optimized control sequences and tested each for its ability to fold in dynamic MC simulations. Although sequences in which either the number of anti-parallel sheet contacts or the energy of the surface residues is non-optimal are capable of folding almost as well as fully optimized ones, sequences in which only the energy gap is optimized fold markedly more slowly. Implications of the results for the design of proteins are discussed.
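To make the lattice model in the abstract concrete, the following minimal Python sketch shows how non-bonded contacts and a contact energy could be computed for a chain on a simple cubic lattice, together with the standard Metropolis acceptance test used in Monte Carlo simulations. It is an illustration only: the function names, the two-letter alphabet, and the `pair_energy` table are hypothetical, and the abstract does not state the actual interaction potential, move set, or temperature used in the study.

```python
import math
import random

# Unit steps to the six nearest neighbours on a simple cubic lattice.
NEIGHBOR_STEPS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def contact_pairs(coords):
    """Return index pairs (i, j) with j > i + 1 that sit on adjacent sites.

    Pairs with j == i + 1 are covalently bonded chain neighbours and are
    therefore not counted as contacts.
    """
    site_to_index = {site: i for i, site in enumerate(coords)}
    pairs = []
    for i, (x, y, z) in enumerate(coords):
        for dx, dy, dz in NEIGHBOR_STEPS:
            j = site_to_index.get((x + dx, y + dy, z + dz))
            if j is not None and j > i + 1:
                pairs.append((i, j))
    return pairs


def contact_energy(coords, sequence, pair_energy):
    """Sum pairwise energies over all non-bonded contacts.

    `pair_energy` is a hypothetical table keyed by sorted residue-type
    pairs; it stands in for whatever potential the study used.
    """
    total = 0.0
    for i, j in contact_pairs(coords):
        key = tuple(sorted((sequence[i], sequence[j])))
        total += pair_energy[key]
    return total


def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis criterion: accept downhill moves always, uphill moves
    with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


# Toy example: a 4-residue chain folded into a square in the z = 0 plane.
toy_coords = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
toy_sequence = "HPPH"
toy_energies = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "P"): 0.0}
toy_energy = contact_energy(toy_coords, toy_sequence, toy_energies)
```

In a dynamic simulation of this kind, a trial move would change `coords`, the energy difference would be passed to `metropolis_accept`, and the move would be kept or discarded accordingly. The QSPR descriptors named in the abstract (sheet contacts, energy gap, surface contact energy) would be computed from quantities like these, but their precise definitions are not given here.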

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Databases, Factual
  • Monte Carlo Method
  • Neural Networks, Computer
  • Protein Engineering
  • Protein Folding*
  • Proteins / chemistry*
  • Structure-Activity Relationship


Substances

  • Proteins