Low-usage codons in Escherichia coli, yeast, fruit fly and primates

Gene. 1991 Aug 30;105(1):61-72. doi: 10.1016/0378-1119(91)90514-c.


Codon usage is compared among four classes of species, with an emphasis on the characterization of low-usage codons. The classes analyzed are the bacterium Escherichia coli (ECO), the yeast Saccharomyces cerevisiae (YSC), the fruit fly Drosophila melanogaster (DRO), and several species of primates (PRI), taken as a group of eleven species for which nucleotide sequence data have been reported to GenBank, although more than 90% of the sequences were from Homo sapiens. The numbers of protein-coding sequences analyzed were 968 for ECO, 484 for YSC, 244 for DRO, and 1518 for PRI. Three methods were used to determine low-usage codons in these species. The first and most common method assesses codon usage by summing the number of times each codon appears in the reading frames of the genome in question. The second examines the distribution of usage among genes by scoring the number of protein reading frames in which a particular codon does not appear. The third starts from a similar notion, but instead considers combinations of codons that are missing from the maximum number of genes. These three methods give very similar results. Each species has a unique combination of eight least-used codons, but in every species that combination includes the arginine codons CGA and CGG. The agreement between YSC and PRI is particularly striking, as they share six low-usage codons, all of which carry the dinucleotide sequence CG. The eight least-used codons in PRI include all codons that contain the CG dinucleotide sequence. Low-usage codons are clearly avoided in genes encoding abundant proteins in ECO, YSC, and DRO. In all species, proteins containing a high percentage of low-usage codons could be characterized as cases where an excess of the protein could be detrimental. Low codon usage is relatively insensitive to gross base composition. However, dinucleotide usage can sometimes influence codon usage, most notably in the case of CG dinucleotides in PRI.
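The three ways of scoring low-usage codons described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' program: the function names, the DNA (T rather than U) alphabet, the toy sequences, and the restriction of the third method to a small candidate list (an exhaustive search over all sets of eight codons would be very large) are assumptions made for illustration. The third function also assumes one reading of the abstract, namely that a combination counts as "missing" from a gene only when none of its codons occurs in that gene.

```python
from collections import Counter
from itertools import combinations, product

BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
SENSE_CODONS = ["".join(p) for p in product(BASES, repeat=3)
                if "".join(p) not in STOP_CODONS]
SENSE_SET = set(SENSE_CODONS)

# Sense codons containing the CG dinucleotide (CGN and NCG).
CG_CODONS = [c for c in SENSE_CODONS if "CG" in c]


def codons_of(cds):
    """Split a coding sequence into in-frame triplets.

    A partial trailing triplet, stop codons, and ambiguous triplets are dropped.
    """
    cds = cds.upper()
    end = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, end, 3) if cds[i:i + 3] in SENSE_SET]


def total_usage(genes):
    """Method 1: total number of occurrences of each codon over all genes."""
    counts = Counter({c: 0 for c in SENSE_CODONS})
    for cds in genes:
        counts.update(codons_of(cds))
    return counts


def genes_lacking(genes):
    """Method 2: number of genes in which each codon does not appear."""
    absent = Counter({c: 0 for c in SENSE_CODONS})
    for cds in genes:
        present = set(codons_of(cds))
        for codon in SENSE_CODONS:
            if codon not in present:
                absent[codon] += 1
    return absent


def best_missing_combination(genes, k, candidates):
    """Method 3 (assumed reading): the k-codon combination, drawn from
    `candidates`, that is entirely absent from the largest number of genes."""
    gene_sets = [set(codons_of(cds)) for cds in genes]

    def score(combo):
        # Count genes that contain none of the codons in the combination.
        return sum(1 for present in gene_sets if present.isdisjoint(combo))

    return max(combinations(sorted(candidates), k), key=score)


# Hypothetical toy reading frames, not data from the paper.
toy_genes = [
    "ATGGCTGCTAAATAA",
    "ATGCGAGCTAAATAA",
    "ATGGCTCTGCTGTAA",
]

usage = total_usage(toy_genes)
lacking = genes_lacking(toy_genes)
combo = best_missing_combination(toy_genes, 2, ["CGA", "CGG", "AAA"])
```

Sorting `usage` in ascending order or `lacking` in descending order gives a ranked list of least-used codons under the first two methods. `CG_CODONS` has eight entries, which matches the abstract's statement that the eight least-used codons in PRI include every CG-containing codon.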

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Base Composition / genetics
  • Codon / genetics*
  • Dinucleoside Phosphates / genetics
  • Drosophila melanogaster / genetics*
  • Escherichia coli / genetics*
  • Gene Expression Regulation / genetics
  • Hominidae / genetics*
  • Humans
  • Open Reading Frames / genetics
  • Primates / genetics
  • Proteins / genetics
  • Saccharomyces cerevisiae / genetics*

Substances

  • Codon
  • Dinucleoside Phosphates
  • Proteins