Codon usage in vertebrates is associated with a low risk of acquiring nonsense mutations

J Transl Med. 2011 Jun 8;9:87. doi: 10.1186/1479-5876-9-87.

Abstract

Background: Codon usage in genomes is biased towards specific subsets of codons. Codon usage bias affects translational speed and accuracy, and it is associated with tRNA levels and the GC content of the genome. Spontaneous mutations drive genomes towards a low GC content. Active cellular processes are needed to maintain a high GC content, which influences the codon usage of a species. Loss-of-function mutations, such as nonsense mutations, are the molecular basis of many recessive alleles; they can greatly affect the genome of an organism and are the cause of many genetic diseases in humans.

Methods: We developed an event-based model to calculate the risk of acquiring nonsense mutations in coding sequences. Complete coding sequences and genomes of 40 eukaryotes were analyzed for GC and CpG content, codon usage, and the associated risk of acquiring nonsense mutations. We included one species per genus for all eukaryotes with an available reference sequence.
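
The kind of quantities named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' model: it assumes the standard nuclear stop codons (TAA, TAG, TGA), treats all nine single-base substitutions of a codon as equally likely "events", and reports an unnormalized fraction rather than the observed/expected ratio given in the Results. The function names (gc_content, cpg_count, nonsense_events, nonsense_risk) are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only; NOT the published event-based model.
# Assumptions: standard stop codons, all single-base substitutions
# weighted equally, no transition/transversion or CpG weighting.

BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_count(seq):
    """Number of CpG dinucleotides (a C immediately followed by a G)."""
    return seq.upper().count("CG")


def nonsense_events(codon):
    """Count the single-base substitutions (out of 9) that turn
    the given codon into a stop codon."""
    events = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a substitution
            mutant = codon[:pos] + base + codon[pos + 1:]
            if mutant in STOP_CODONS:
                events += 1
    return events


def nonsense_risk(cds):
    """Mean fraction of single-base substitutions per sense codon
    that would create a stop codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Skip stop codons and codons containing ambiguous bases such as N.
    sense = [c for c in codons
             if c not in STOP_CODONS and set(c) <= set(BASES)]
    if not sense:
        raise ValueError("no sense codons found")
    return sum(nonsense_events(c) for c in sense) / (9 * len(sense))
```

Under these simplifying assumptions, synonymous codons can differ in exposure: the arginine codon CGA is one substitution away from the stop codon TGA, whereas the GC-richer synonym CGC cannot become a stop codon through any single substitution. This is one way a GC-driven codon usage bias could lower the overall risk across a coding sequence.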

Results: We discovered that the codon usage bias detected in genomes of high GC content decreases the risk of acquiring nonsense mutations (Pearson's r = -0.95; P < 0.0001). In the genomes of all examined vertebrates, including humans, this risk was lower than expected (0.93 ± 0.02; mean ± SD) and lower than the risk in genomes of non-vertebrates (1.02 ± 0.13; P = 0.019).

Conclusions: While the maintenance of a high GC content is energetically costly, it is associated with a codon usage bias harboring a low risk of acquiring nonsense mutations. The reduced exposure to this risk may contribute to the fitness of vertebrates.

Publication types

  • Research Support, N.I.H., Intramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids / genetics
  • Animals
  • Base Composition / genetics
  • Base Sequence
  • Codon, Nonsense / genetics*
  • CpG Islands / genetics
  • Genetic Code / genetics
  • Molecular Sequence Data
  • Open Reading Frames / genetics
  • Risk Factors
  • Species Specificity
  • Vertebrates / genetics*

Substances

  • Amino Acids
  • Codon, Nonsense