Graphic analysis of codon usage strategy in 1490 human proteins

J Protein Chem. 1993 Jun;12(3):329-35. doi: 10.1007/BF01028195.


The frequencies of bases A (adenine), C (cytosine), G (guanine), and T (thymine) occurring in codon position i, denoted by ai, ci, gi, and ti, respectively (i = 1, 2, 3), have been calculated and diagrammatized for the 1490 human proteins in a recently compiled codon usage table for primate genes. Based on the characteristic graphs thus obtained, an overall picture of codon base distribution is provided, and the relevant biological implications are discussed. For the first codon position, G is the dominant base in most cases, and the relationship g1 > a1 > c1 > t1 generally holds. For the second codon position, A is generally the dominant base and G is generally the least frequent, with the relationship a2 > t2 > c2 > g2. For the third codon position, the values of g3 + c3 vary from 0.27 to 1, and the relationship c3 > g3 > a3 = t3 roughly holds for the majority of cases. Interestingly, if the average frequencies for bases A, C, G, and T are defined as a = (a1 + a2 + a3)/3, c = (c1 + c2 + c3)/3, g = (g1 + g2 + g3)/3, and t = (t1 + t2 + t3)/3, respectively, we find that a^2 + c^2 + g^2 + t^2 < 1/3 is valid almost without exception. Such a characteristic inequality might reflect some inherent rule of codon usage, although its biological implication is unclear. (ABSTRACT TRUNCATED AT 250 WORDS)
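The quantities defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names and the toy sequence are hypothetical, the input is assumed to be an in-frame coding sequence containing only A, C, G, and T, and the inequality is read as the sum of the squared average frequencies, a^2 + c^2 + g^2 + t^2.

```python
from collections import Counter

BASES = "ACGT"


def codon_position_frequencies(cds):
    """Return {i: {base: frequency}} for codon positions i = 1, 2, 3.

    Assumes `cds` is in frame and contains only A, C, G, T.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    n_codons = len(cds) // 3
    freqs = {}
    for i in range(3):
        # Every third base starting at offset i belongs to codon position i + 1.
        counts = Counter(cds[i::3])
        freqs[i + 1] = {b: counts[b] / n_codons for b in BASES}
    return freqs


def average_frequencies(freqs):
    """Average each base over the three positions, e.g. a = (a1 + a2 + a3)/3."""
    return {b: sum(freqs[i][b] for i in (1, 2, 3)) / 3 for b in BASES}


def sum_of_squares(avg):
    """Compute a^2 + c^2 + g^2 + t^2 from the average frequencies."""
    return sum(v * v for v in avg.values())


if __name__ == "__main__":
    toy_cds = "ATGGCCGAGAAGCTGTAA"  # hypothetical sequence, not from the paper
    freqs = codon_position_frequencies(toy_cds)
    avg = average_frequencies(freqs)
    print(freqs)
    print(avg)
    print(sum_of_squares(avg) < 1 / 3)
```

Because a + c + g + t = 1, the sum of squares is bounded below by 1/4 (all four bases equally frequent) and above by 1 (a single base only), so the reported bound of 1/3 restricts the average composition to a fairly narrow range near uniformity.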

MeSH terms

  • Adenine / physiology
  • Codon / genetics*
  • Cytosine / physiology
  • DNA / genetics
  • Guanine / physiology
  • Humans
  • Mathematical Computing
  • Models, Genetic*
  • Proteins / genetics*
  • Thymine / physiology


Substances

  • Codon
  • Proteins
  • Guanine
  • Cytosine
  • DNA
  • Adenine
  • Thymine