Predicting protein stability changes from sequences using support vector machines

Bioinformatics. 2005 Sep 1;21 Suppl 2:ii54-8. doi: 10.1093/bioinformatics/bti1109.


Motivation: Predicting protein stability changes upon mutation is key to understanding protein folding and misfolding. At present, stability changes can be predicted only when the atomic structure of the protein is available. Methods that address the same task starting from the protein sequence are, however, necessary to complete genome annotation, especially in relation to single nucleotide polymorphisms (SNPs) and related diseases.

Results: We develop a method based on support vector machines that, starting from the protein sequence, predicts the sign and the value of the free energy stability change (ΔΔG) upon single point mutation. We show that the accuracy of our predictor is as high as 77% in the specific task of predicting the sign of ΔΔG, that is, whether the mutation increases or decreases protein stability. When predicting the ΔΔG values, a satisfactory correlation with the experimental data is also found. As a final blind benchmark, the predictor is applied to proteins with a set of disease-related SNPs for which thermodynamic data are also known. We find that our predictions corroborate the view that disease-related mutations correspond to a decrease in protein stability.
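The two evaluation criteria mentioned above, agreement on the sign of DeltaDeltaG (ΔΔG) and correlation between predicted and experimental values, can be illustrated with a minimal sketch. This is not the authors' code: the function names `sign_accuracy` and `pearson_r` are hypothetical, the numbers are invented for illustration, the Pearson coefficient is assumed as the correlation measure, and counting a value of exactly zero as non-negative is an arbitrary choice made here.

```python
import math


def sign_accuracy(predicted, experimental):
    """Fraction of mutations whose predicted ddG sign matches the experimental sign.

    Assumption: a value of exactly 0 is grouped with the positive values.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum((p >= 0) == (e >= 0) for p, e in zip(predicted, experimental))
    return hits / len(predicted)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented ddG values (kcal/mol) for five hypothetical point mutations.
experimental = [-1.2, 0.4, -2.5, -0.3, 1.1]
predicted = [-0.8, -0.1, -1.9, -0.6, 0.7]

print("sign accuracy:", sign_accuracy(predicted, experimental))
print("Pearson r:", round(pearson_r(predicted, experimental), 3))
```

In this toy data set the second mutation is predicted with the wrong sign, so four of the five signs agree. The two measures answer different questions: sign accuracy treats the task as a classification, while the correlation rewards getting the magnitude of the change right as well.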


Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Amino Acid Substitution
  • Artificial Intelligence*
  • Computer Simulation
  • Models, Chemical*
  • Molecular Sequence Data
  • Pattern Recognition, Automated / methods*
  • Protein Denaturation
  • Protein Folding
  • Proteins / chemistry*
  • Sequence Alignment / methods
  • Sequence Analysis, Protein / methods*
  • Structure-Activity Relationship


Substances

  • Proteins