PLoS One. 2013;8(2):e56642.
doi: 10.1371/journal.pone.0056642. Epub 2013 Feb 25.

Causes and implications of codon usage bias in RNA viruses

Ilya S Belalov et al. PLoS One. 2013.

Abstract

The choice of synonymous codons depends on the nucleotide/dinucleotide composition of the genome (termed mutational pressure) and the relative abundance of tRNAs in a cell (translational pressure). Mutational pressure is commonly simplified to genomic GC content; however, mononucleotide and dinucleotide frequencies in different genomes or mRNAs may vary significantly, especially in RNA viruses. A series of in silico shuffling algorithms was developed to account for these features and to analyze the relative impact of mutational pressure components on codon usage bias in RNA viruses. Total GC content was a poor descriptor of viral genome composition and of the causes of codon usage bias. Genomic nucleotide content was the single most important factor determining synonymous codon usage. Moreover, the choice between compatible amino acids (e.g., leucine and isoleucine) was strongly affected by genomic nucleotide composition. Dinucleotide composition at codon positions 2-3 had an additional effect on codon usage. Together with mononucleotide composition bias, it could explain almost the entire codon usage bias in RNA viruses. On the other hand, the strong dinucleotide content bias at codon positions 3-1 found in some viruses had very little effect on codon usage. A hypothetical innate immunity sensor for CpG in RNA could partially explain the codon usage bias, but because virus translation depends on biased host translation machinery, experimental studies are required to further explore the source of dinucleotide bias in RNA viruses.
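
As a rough illustration of how mononucleotide composition alone can drive synonymous codon choice, the sketch below re-draws every codon from its synonymous set, weighting each candidate by the product of the sequence-wide frequencies of its three nucleotides. This is not one of the shuffling algorithms used in the paper, whose details are not given in the abstract. The function names (`split_codons`, `resample_synonymous`), the weighting scheme, and the use of the standard genetic code are assumptions made for illustration, and the input is assumed to contain only A, C, G and T/U.

```python
import random
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid (or stop signal) they encode.
SYNONYMS = {}
for _codon, _aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def split_codons(seq):
    """Split a coding sequence into codons, dropping any incomplete tail."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def resample_synonymous(seq, rng):
    """Hypothetical mononucleotide model: keep the protein, redraw the codons.

    Each synonymous candidate is weighted by the product of the observed
    frequencies of its three nucleotides in the input sequence.
    """
    codons = split_codons(seq)
    base_count = Counter("".join(codons))
    out = []
    for codon in codons:
        options = SYNONYMS[CODON_TO_AA[codon]]
        weights = [
            base_count[c[0]] * base_count[c[1]] * base_count[c[2]]
            for c in options
        ]
        out.append(rng.choices(options, weights=weights)[0])
    return "".join(out)


if __name__ == "__main__":
    original = "ATGCTGCGAATTGGACGTTAA"
    print(resample_synonymous(original, random.Random(0)))
```

Comparing a codon usage statistic such as ENC between the original sequence and many such resampled sequences indicates how much of the observed bias a composition-only model can reproduce.
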

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Effective number of codons in the original and shuffled sequences.
(A) Explanation of shuffling algorithms. (B) Effective number of codons in the original virus sequence and the mean ENC of 1000 sequences randomized or scrambled using different algorithms. Virus name acronyms are provided in Table 1.
Figure 2. CpG content at different codon positions in RNA viruses.
(A) Relative CpG dinucleotide content at codon positions 2-3 and 3-1 in RNA viruses (Table 1) was calculated as the ratio of the observed CpG content to the expected content, where the expected content is the product of the genomic nucleotide frequencies at the corresponding codon positions, e.g. f[C2]×f[G3] for the CpG23 dinucleotide. (B) Mean relative CpG dinucleotide content in extended datasets of five RNA viruses. White, codon positions 2-3; gray, codon positions 3-1. Error bars indicate standard errors of the mean. Virus name acronyms are provided in Table 1.
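
The observed/expected ratio described in the Figure 2 caption can be sketched as follows. This is a minimal reading of the caption, not the authors' code: the function name `relative_cpg` is hypothetical, the sequence is assumed to be a single in-frame coding region, and the 3-1 dinucleotide is assumed to span position 3 of one codon and position 1 of the next.

```python
import math


def relative_cpg(seq):
    """Return (CpG 2-3 ratio, CpG 3-1 ratio), each as observed / expected.

    Expected values are products of position-specific nucleotide
    frequencies: f[C2]*f[G3] for positions 2-3 and f[C3]*f[G1] for 3-1.
    A ratio is NaN when its expected value is zero.
    """
    seq = seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    n = len(codons)
    if n == 0:
        return math.nan, math.nan

    def freq(pos, base):
        # Frequency of `base` at codon position `pos` (0-based).
        return sum(c[pos] == base for c in codons) / n

    def ratio(observed, expected):
        return observed / expected if expected > 0 else math.nan

    # CpG within a codon: C at position 2, G at position 3.
    obs23 = sum(c[1] == "C" and c[2] == "G" for c in codons) / n

    # CpG across a codon junction: C at position 3, G at position 1 of the next.
    junctions = list(zip(codons, codons[1:]))
    if junctions:
        obs31 = sum(a[2] == "C" and b[0] == "G" for a, b in junctions) / len(junctions)
    else:
        obs31 = math.nan

    return (
        ratio(obs23, freq(1, "C") * freq(2, "G")),
        ratio(obs31, freq(2, "C") * freq(0, "G")),
    )
```

A ratio below 1 indicates that CpG is under-represented at that position relative to what the nucleotide frequencies alone would predict.
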
Figure 3. Plots of ENC vs. sequence content in 29 RNA viruses indicated in Table 1 (black dots) and in 1000 simulated sequences with random third-position nucleotide content (gray circles).
Plot of ENC against GC content (A), variance of third-position GC content (B), variance of third-position nucleotide frequencies (C) and variance of dinucleotide frequencies at codon positions 2-3 (D). Solid line in panel (A) indicates the theoretical prediction of ENC as a function of GC content bias.
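
The caption does not say which formula produces the solid line in panel (A). A commonly used null expectation for this kind of plot is Wright's (1990) curve, which gives the ENC expected when codon usage is shaped only by third-position GC content. The sketch below assumes that curve; the function name `expected_enc` is hypothetical.

```python
def expected_enc(gc3):
    """Wright's expected ENC for a given third-position GC fraction.

    Assumption: ENC = 2 + s + 29 / (s^2 + (1 - s)^2), with s = gc3 in [0, 1].
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"GC3 = {s:.2f}  expected ENC = {expected_enc(s):.2f}")
```

Sequences that fall well below this curve have stronger codon usage bias than GC content alone would predict.
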
Figure 4. Correlation between sequence content at synonymous positions and sequence content at nonsynonymous positions/encoded protein content.
Correlation between C content at codon positions 1+2 vs. 3 (A); ratio of isoleucine to leucine content vs. third-position A content (B); ratio of arginine to glycine content vs. third-position G content (C) in 29 animal RNA viruses (Table 1).

Grants and funding

This work was supported by Deutsche Forschungsgemeinschaft [grant DR772/2-1]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.