Monte Carlo simulation studies on the prediction of protein folding types from amino acid composition

Biophys J. 1992 Dec;63(6):1523-9. doi: 10.1016/S0006-3495(92)81728-9.

Abstract

In developing methods for the statistical prediction of protein structures, the authors of different methods have usually selected different sets of proteins to test their predictions. It is therefore hard to make a fair comparison based on the results they reported. Even if different methods are applied to the same set of proteins, a problem remains: a method that performs better than another on one set of proteins will not necessarily remain better when applied to another set. To tackle this problem, a Monte Carlo simulation method is proposed to establish an objective criterion for measuring the accuracy of prediction of the protein folding type. This objective accuracy corresponds to the asymptotic limit generated during the Monte Carlo simulation process. On that basis, it has been found that the average objective accuracy for predicting the all-alpha, all-beta, alpha + beta, and alpha/beta proteins by the least Euclid's distance method (Nakashima, H., K. Nishikawa, and T. Ooi. 1986. J. Biochem. 99:152-162) is 73.0%, and that by the least Minkowski's distance method (Chou, P.Y. 1989. Prediction in Protein Structure and the Principles of Protein Conformation. Plenum Press. New York. 549-586) is 70.9%, indicating that the former is better than the latter. However, according to the original reports, the latter claimed a rate of correct prediction of 79.7% and the former only 70.2%, which leads to exactly the opposite conclusion. This shows the necessity of establishing an objective criterion: a comparison is meaningful only when it is based on such a criterion. The simulation method and the idea developed here can also be applied to examine any other statistical prediction method.
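The notion of an objective accuracy as the asymptotic limit of a Monte Carlo process can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not spell out. The class-mean vectors in `CLASS_MEANS`, the spread `SIGMA`, the reduction to three composition components (a real amino acid composition has 20), and sampling from independent normal distributions are all assumptions made for illustration. The sketch draws synthetic composition vectors for a known folding type, assigns each to the class with the least Euclidean distance, and records the running fraction of correct predictions, which settles toward a limit as the number of trials grows.

```python
import math
import random

# Hypothetical class-mean compositions over 3 illustrative components.
# These numbers are invented for the sketch and are NOT taken from the paper.
CLASS_MEANS = {
    "all-alpha": [0.40, 0.25, 0.35],
    "all-beta": [0.25, 0.40, 0.35],
    "alpha+beta": [0.33, 0.33, 0.34],
    "alpha/beta": [0.36, 0.30, 0.34],
}
SIGMA = 0.04  # assumed standard deviation of each component


def euclid(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def predict(x):
    """Return the folding type whose mean vector is nearest to x."""
    return min(CLASS_MEANS, key=lambda name: euclid(x, CLASS_MEANS[name]))


def simulate(n_trials, seed=0):
    """Return the running prediction accuracy after each simulated protein."""
    rng = random.Random(seed)
    names = list(CLASS_MEANS)
    correct = 0
    running = []
    for t in range(1, n_trials + 1):
        true_type = rng.choice(names)
        # Draw a synthetic composition around the true class mean.
        x = [rng.gauss(m, SIGMA) for m in CLASS_MEANS[true_type]]
        if predict(x) == true_type:
            correct += 1
        running.append(correct / t)
    return running


if __name__ == "__main__":
    acc = simulate(20000)
    for n in (100, 1000, 10000, 20000):
        print(f"after {n:>6} trials: running accuracy = {acc[n - 1]:.3f}")
```

In this sketch the running accuracy fluctuates for small numbers of trials and stabilizes for large ones. That stable value, rather than the score on any one hand-picked test set, plays the role of the objective accuracy. Swapping `euclid` for another distance function would allow two methods to be compared on the same footing.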

MeSH terms

  • Amino Acids / chemistry
  • Evaluation Studies as Topic
  • Monte Carlo Method*
  • Normal Distribution
  • Protein Folding*
  • Proteins / chemistry

Substances

  • Amino Acids
  • Proteins