BMC Bioinformatics. 2012 Mar 22;13:43.
doi: 10.1186/1471-2105-13-43.

Codon Deviation Coefficient: A Novel Measure for Estimating Codon Usage Bias and Its Statistical Significance

Free PMC article

Zhang Zhang et al. BMC Bioinformatics. 2012.


Background: Genetic mutation, selective pressure for translational efficiency and accuracy, level of gene expression, and protein function through natural selection are all believed to lead to codon usage bias (CUB). Therefore, informative measurement of CUB is of fundamental importance to making inferences regarding gene function and genome evolution. However, extant measures of CUB have not fully accounted for the quantitative effect of background nucleotide composition and have not statistically evaluated the significance of CUB in sequence analysis.

Results: Here we propose a novel measure, the Codon Deviation Coefficient (CDC), that provides an informative measurement of CUB and its statistical significance without requiring any prior knowledge. Unlike previous measures, CDC estimates CUB by accounting for background nucleotide compositions tailored to codon positions and adopts bootstrapping to assess the statistical significance of CUB for any given sequence. We evaluate CDC by examining its effectiveness on simulated sequences and empirical data and show that CDC outperforms extant measures by achieving a more informative estimation of CUB and its statistical significance.
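
The general idea of comparing observed codon usage with an expectation built from position-specific base composition, and then bootstrapping for significance, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the CDC formula, so the cosine-distance deviation, the resampling scheme, and all function names (`split_codons`, `expected_usage`, `deviation`, `bootstrap_p_value`) are assumptions made for this example.

```python
import math
import random
from collections import Counter
from itertools import product

BASES = "ACGT"
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed
SENSE = [a + b + c for a, b, c in product(BASES, repeat=3) if a + b + c not in STOPS]
SENSE_SET = set(SENSE)


def split_codons(seq):
    """Return the in-frame sense codons of a coding sequence."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3) if seq[i:i + 3] in SENSE_SET]


def expected_usage(codons):
    """Expected codon frequencies: product of position-specific base
    frequencies, renormalised over the sense codons."""
    n = len(codons)
    pos = [Counter(c[p] for c in codons) for p in range(3)]
    raw = {c: math.prod(pos[p][c[p]] / n for p in range(3)) for c in SENSE}
    total = sum(raw.values())
    return {c: v / total for c, v in raw.items()}


def deviation(codons):
    """Assumed deviation score: 1 - cosine similarity between observed and
    expected codon usage (0 means usage matches the background)."""
    if not codons:
        raise ValueError("no sense codons found")
    n = len(codons)
    counts = Counter(codons)
    exp = expected_usage(codons)
    dot = sum(exp[c] * counts[c] / n for c in SENSE)
    norm_obs = math.sqrt(sum((counts[c] / n) ** 2 for c in SENSE))
    norm_exp = math.sqrt(sum(v * v for v in exp.values()))
    return 1.0 - dot / (norm_obs * norm_exp)


def bootstrap_p_value(codons, replicates=200, seed=0):
    """Fraction of background-generated replicates whose deviation is at
    least as large as the observed one (with a +1 correction)."""
    rng = random.Random(seed)
    observed = deviation(codons)
    exp = expected_usage(codons)
    weights = [exp[c] for c in SENSE]
    hits = 0
    for _ in range(replicates):
        sample = rng.choices(SENSE, weights=weights, k=len(codons))
        if deviation(sample) >= observed:
            hits += 1
    return (hits + 1) / (replicates + 1)


if __name__ == "__main__":
    cds = split_codons("GCT" * 50 + "AAG" * 50)
    print(deviation(cds), bootstrap_p_value(cds))
```

In this sketch, a sequence that uses only two of the eight codons allowed by its own positional composition scores well above zero, while replicates drawn from the background score near zero, which is what makes the bootstrap comparison informative.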

Conclusions: As validated by both simulated and empirical data, CDC provides a highly informative quantification of CUB and its statistical significance, useful for determining comparative magnitudes and patterns of biased codon usage for genes or genomes with diverse sequence compositions.


Figure 1
Codon usage bias across a variety of positional background nucleotide compositions. Heterogeneous positional background compositions were considered for GC content (panels A to C) and purine content (panels D to E), respectively. The expected values of codon usage bias are zero for all examined cases.
Figure 2
Codon usage bias across a range of sequence lengths. Sequences were simulated with the four non-uniform positional composition sets: Low (panel A), Med-1 (panel B), Med-2 (panel C) and High (panel D). Each estimate was determined based on 10000 replicate simulated sequences. The expected values of codon usage bias are zero for all examined cases.
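
A simulation of this kind might be set up roughly as below. This is only a sketch under stated assumptions: the caption names the composition sets (Low, Med-1, Med-2, High) but does not list their values, so the GC and purine contents used in the usage example are hypothetical, as are the function names `base_probs` and `simulate_cds`. The sketch also assumes that GC content and purine content are independent at each codon position and that in-frame stop codons are rejected.

```python
import random

STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def base_probs(gc, purine):
    """Base probabilities implied by a GC content and a purine (A+G)
    content, assuming the two properties are independent."""
    return {
        "A": (1 - gc) * purine,
        "G": gc * purine,
        "C": gc * (1 - purine),
        "T": (1 - gc) * (1 - purine),
    }


def simulate_cds(n_codons, gc_by_pos, purine_by_pos, rng):
    """Draw a random coding sequence whose three codon positions follow
    their own base composition; in-frame stop codons are redrawn."""
    tables = [base_probs(g, r) for g, r in zip(gc_by_pos, purine_by_pos)]
    codons = []
    while len(codons) < n_codons:
        codon = "".join(
            rng.choices(list(t), weights=list(t.values()))[0] for t in tables
        )
        if codon not in STOPS:
            codons.append(codon)
    return "".join(codons)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical positional GC and purine contents, for illustration only.
    seq = simulate_cds(300, (0.5, 0.4, 0.8), (0.5, 0.5, 0.5), rng)
    print(seq[:60])
```

Because codons are drawn from the background composition alone, any bias reported for such sequences reflects the measure rather than the data, which is why the expected value in these figures is zero.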
Figure 3
Heterogeneity of positional background nucleotide compositions in E. coli (2,766 genes in M9 medium), S. cerevisiae (5,142 genes), D. melanogaster (1,651 genes), C. elegans (12,184 genes), and A. thaliana (1,332 genes). Heterogeneities of positional GC contents are represented by absolute differences between overall GC content and its positional contents: GC-GC1 for the first position (panel A), GC-GC2 for the second position (panel B), and GC-GC3 for the third position (panel C), respectively. Likewise, heterogeneities of positional purine content are absolute differences between overall purine (AG) content and its positional contents: AG-AG1 for the first position (panel D), AG-AG2 for the second position (panel E), and AG-AG3 for the third position (panel F), respectively.
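
The heterogeneity values described in this caption are simple to compute for a single gene. The following is a minimal sketch, assuming an in-frame coding sequence with unambiguous bases; the function names `positional_content` and `heterogeneity` are hypothetical and chosen for this example.

```python
def positional_content(seq, bases):
    """Overall and per-codon-position fraction of `bases` (e.g. "GC" or
    "AG") in an in-frame coding sequence."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    if usable == 0:
        raise ValueError("sequence shorter than one codon")
    seq = seq[:usable]
    overall = sum(b in bases for b in seq) / usable
    by_pos = [sum(b in bases for b in seq[p::3]) / (usable // 3) for p in range(3)]
    return overall, by_pos


def heterogeneity(seq, bases):
    """Absolute differences between the overall content and the content
    at codon positions 1, 2 and 3."""
    overall, by_pos = positional_content(seq, bases)
    return [abs(overall - x) for x in by_pos]


if __name__ == "__main__":
    gene = "GCAGCAGCA"
    print("GC heterogeneity:", heterogeneity(gene, "GC"))
    print("purine heterogeneity:", heterogeneity(gene, "AG"))
```

For the toy sequence above, every first and second position is G or C and every third position is A, so the third position deviates most from the overall GC content.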
Figure 4
Comparison of CDC distributions between ribosomal protein (RP) genes (54 genes, with CDC ranging from 0.244 to 0.481) and all genes (4,144 genes, with CDC ranging from 0.046 to 0.550) in E. coli.



