A joint prediction of the folding types of 1490 human proteins from their genetic codons

J Theor Biol. 1993 Mar 21;161(2):251-62. doi: 10.1006/jtbi.1993.1053.


Wada et al. (1990) published the codon usages for 1490 human proteins. Based on these data, the frequencies of occurrence of the 20 amino acids in each of the 1490 proteins were calculated according to the genetic code. Proteins are generally classified into five folding types: alpha, beta, alpha + beta, alpha/beta and zeta (irregular). The folding type of a protein is correlated with its amino acid composition. Using three methods established by different investigators, the folding type of each of the 1490 human proteins was predicted. Tests of these methods on some proteins of known structure indicate that the accuracy of prediction for the 1490 human proteins is at least 80%. The folding type remains uncertain for only six proteins, for which the three methods gave completely inconsistent results. For the remaining 1484 human proteins, the numbers of alpha, beta, alpha + beta, alpha/beta and zeta type proteins were found to be 128, 235, 169, 933 and 19, respectively, suggesting that alpha/beta proteins predominate in this set of human proteins. The occurrence frequencies of the bases in the first, second and third codon positions were calculated for each folding type. The folding type of a protein is shown to depend strongly on the ratio of the frequency of base G in the first codon position to that in the second codon position. The biological implications of the results are discussed.
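
The calculations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the input format (a mapping from codon to count for one protein), the exclusion of stop codons from the amino acid composition, and the majority-vote rule for combining the three predictors are all assumptions made for illustration. The three prediction methods themselves are not reproduced here. Only the standard genetic code table is taken as given.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def normalize(codon):
    """Upper-case a codon and write it in the DNA alphabet (U -> T)."""
    return codon.upper().replace("U", "T")


def amino_acid_frequencies(codon_counts):
    """Translate codon counts into amino acid frequencies.

    Assumption: stop codons are left out of the composition.
    """
    totals = Counter()
    for codon, n in codon_counts.items():
        aa = CODON_TABLE[normalize(codon)]
        if aa != "*":
            totals[aa] += n
    total = sum(totals.values())
    return {aa: n / total for aa, n in totals.items()}


def base_frequency(codon_counts, base, position):
    """Frequency of `base` at a codon position (0, 1 or 2) over all codons."""
    total = sum(codon_counts.values())
    hits = sum(n for codon, n in codon_counts.items()
               if normalize(codon)[position] == base)
    return hits / total


def g1_g2_ratio(codon_counts):
    """Ratio of G frequency in the first codon position to that in the second.

    Raises ZeroDivisionError if G never occurs in the second position.
    """
    return base_frequency(codon_counts, "G", 0) / base_frequency(codon_counts, "G", 1)


def consensus(predictions):
    """Hypothetical joint rule: return the type at least two methods agree on,
    or None when all three predictions differ."""
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes >= 2 else None


# Toy codon counts, invented purely to exercise the functions.
toy = {"GGT": 2, "GCT": 1, "AAG": 1}
print(amino_acid_frequencies(toy))
print(g1_g2_ratio(toy))
print(consensus(["alpha", "alpha", "beta"]))
```

Returning None from the hypothetical `consensus` function corresponds to the "completely inconsistent" case reported for six proteins. How the paper resolved partial agreement among the three methods is not stated in the abstract.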

MeSH terms

  • Amino Acid Sequence
  • Codon*
  • Forecasting / methods
  • Humans
  • Protein Folding*
  • Proteins / genetics*


Substances

  • Codon
  • Proteins