Predicting protein folding types by distance functions that make allowances for amino acid interactions

J Biol Chem. 1994 Sep 2;269(35):22014-20.


Given the amino acid composition of a protein, how can one predict its folding type? Although a number of methods have been proposed for this problem, none of them takes into account the correlative effect among different amino acids, which has limited the prediction accuracy they can achieve. In view of this, a new method has been developed in which the similarity between two protein molecules is measured by the Mahalanobis distance rather than by ordinary geometric distances such as the Minkowski distance and the Euclidean distance. With the Mahalanobis distance, the correlative effect among different amino acids is incorporated automatically. Predictions were performed for 131 real proteins comprising alpha, beta, alpha+beta, and alpha/beta proteins. The rates of correct prediction are 100% for both alpha and beta proteins, and 88.9% and 89.7% for alpha+beta and alpha/beta proteins, respectively, giving an average accuracy of 94.7%. Predictions were also performed for 10,000 simulated proteins generated by Monte Carlo sampling for each of the four folding types, yielding an average accuracy of 95.9%. Because the accuracy obtained for the simulated proteins avoids the bias due to the limited number of testing proteins selected arbitrarily by different investigators, it can be regarded as an objective accuracy. It is anticipated that a method with such a high objective accuracy will become a reliable tool for predicting the protein folding type and a useful tool for improving the prediction of secondary structure as well.
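
The idea of assigning a protein to the folding type with the smallest Mahalanobis distance can be illustrated with a minimal sketch. It is not the authors' implementation: the two-component feature vectors, the TRAINING values, and the helper names (fit, predict, mahalanobis_sq) are hypothetical and chosen only for illustration. The squared distance from a composition vector x to a class with mean m and covariance matrix S is taken as (x - m)^T S^-1 (x - m). One caveat worth noting: if all 20 composition fractions are used and they sum to 1, S is singular, so the sketch assumes one component has been dropped beforehand.

```python
def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mean):
    """Sample covariance matrix (divides by n - 1)."""
    n, dim = len(rows), len(mean)
    cov = [[0.0] * dim for _ in range(dim)]
    for row in rows:
        diff = [v - m for v, m in zip(row, mean)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += diff[i] * diff[j]
    return [[c / (n - 1) for c in r] for r in cov]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^-1 (x - mean)."""
    diff = [a - b for a, b in zip(x, mean)]
    return sum(d * y for d, y in zip(diff, solve(cov, diff)))


def fit(training):
    """Estimate a (mean, covariance) pair for every folding type."""
    model = {}
    for label, rows in training.items():
        mean = mean_vector(rows)
        model[label] = (mean, covariance(rows, mean))
    return model


def predict(x, model):
    """Return the folding type whose class is nearest in Mahalanobis distance."""
    return min(model, key=lambda label: mahalanobis_sq(x, *model[label]))


# Hypothetical toy data: two composition components per protein, two classes.
TRAINING = {
    "alpha": [[0.30, 0.10], [0.34, 0.14], [0.26, 0.08], [0.32, 0.09]],
    "beta": [[0.10, 0.30], [0.14, 0.33], [0.08, 0.27], [0.12, 0.36]],
}
model = fit(TRAINING)
print(predict([0.29, 0.11], model))
```

Because the class covariance enters through its inverse, correlated components are down-weighted along the directions in which a class naturally varies, which is what distinguishes this measure from the Euclidean distance; with an identity covariance the two coincide.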

MeSH terms

  • Models, Chemical
  • Protein Folding*