Mol Syst Biol. 2010 Oct 19:6:421. doi: 10.1038/msb.2010.78.

Impact of translational error-induced and error-free misfolding on the rate of protein evolution

Jian-Rong Yang et al. Mol Syst Biol. 2010.

Abstract

What determines the rate of protein evolution is a fundamental question in biology. Recent genomic studies revealed a surprisingly strong anticorrelation between the expression level of a protein and its rate of sequence evolution. This observation is currently explained by the translational robustness hypothesis, in which the toxicity of translational error-induced protein misfolding selects for higher translational robustness of more abundant proteins, which constrains sequence evolution. However, the impact of error-free protein misfolding has not been evaluated. We estimate that a non-negligible fraction of misfolded proteins are error free and demonstrate by a molecular-level evolutionary simulation that selection against protein misfolding results in a greater reduction of error-free misfolding than error-induced misfolding. Thus, an overarching protein-misfolding-avoidance hypothesis that includes both sources of misfolding is superior to the translational robustness hypothesis. We show that misfolding-minimizing amino acids are preferentially used in highly abundant yeast proteins and that these residues are evolutionarily more conserved than other residues of the same proteins. These findings provide unambiguous support for the role of protein-misfolding-avoidance in determining the rate of protein sequence evolution.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1
Sources of misfolded proteins. The translational robustness hypothesis considers only translational error-induced misfolding (arrow 6), whereas the overarching protein-misfolding-avoidance hypothesis considers both error-induced misfolding (arrow 6) and error-free misfolding (arrow 4).
Figure 2
A molecular-level evolutionary simulation for examining the roles of error-induced and error-free misfolding in generating the anticorrelation between protein expression level and evolutionary rate. (A) The general scheme of the simulation. Simulations are conducted under error-induced misfolding only (B–E), error-free misfolding only (F–I), or both types of misfolding (J–M). In all cases, after 100 000 generations of evolution, protein unfolding energy ΔG is highly positively correlated with the gene expression level (B, F, and J); the probability of protein misfolding is highly negatively correlated with the gene expression level (C, G, and K); and the number of fixed amino acid changes per sequence per 50 000 generations is highly negatively correlated with ΔG (D, H, and L) and gene expression level (E, I, and M). Correlation coefficients and significance levels are determined by Spearman's rank correlation tests. The red lines in panels B–M are estimated using locally weighted scatterplot smoothing.
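The caption above reports Spearman's rank correlations between expression level, ΔG, misfolding probability, and the number of fixed changes. As a minimal sketch of what that statistic measures (not the authors' code), the correlation coefficient can be computed by ranking both variables and taking the Pearson correlation of the ranks. The function names and the toy numbers below are illustrative assumptions, not data from the paper, and the significance test is not included.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Toy values (made up): higher expression paired with fewer fixed changes.
expression = [50, 400, 700, 1200, 1700]
fixed_changes = [9, 7, 6, 3, 1]
rho = spearman_rho(expression, fixed_changes)
```

Because the statistic depends only on ranks, a strictly decreasing relationship such as the toy one above yields a coefficient of −1 regardless of the scale of either variable.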
Figure 3
Amount of error-free and error-induced protein misfolding found in computer simulations when both sources of misfolding are considered. The probabilities of error-free misfolding (A) and error-induced misfolding (B) and the fraction of misfolded molecules that are error-free (C) all decrease with the rise of the gene expression level. (D) The rate of translational error per protein decreases as the gene expression level increases. (E) The destabilizing effect (−ΔΔG) per translational error increases with gene expression level. (F) The total destabilization effect of mistranslation increases with gene expression level. Correlation coefficients and significance levels are determined by Spearman's rank correlation tests. The red lines are estimated using locally weighted scatterplot smoothing.
Figure 4
Codons minimizing the probability of protein misfolding are used more frequently in highly expressed yeast genes than in lowly expressed genes. (A) An example showing the relative protein misfolding probability (p_misfold) of the yeast gene YDR071C (encoding polyamine acetyltransferase) and those of its 60 mutants that each have the 164th codon of the gene replaced by one of the other 60 sense codons. Note that p_misfold is the misfolding probability of a mutant gene relative to that of the wild-type gene. The wild-type codon at this position is marked in blue. Bars are boxed for preferred synonymous codons and unboxed for unpreferred synonymous codons of each amino acid. The inset is an enlarged figure that better shows small differences in p_misfold among some synonymous mutants. (B) The fraction (f_matching codon) of wild-type codons that match the codons with the smallest p_misfold is positively correlated with protein expression level. (C) The fraction (f_matching aa) of wild-type amino acids that match the amino acids encoded by the codons with the smallest p_misfold is positively correlated with protein expression level. In both B and C, genes are separated into 10 equal-size bins. The expression ranges of the 10 bins are [49.2, 358], (358, 688], (688, 1140], (1140, 1630], (1630, 2250], (2250, 3130], (3130, 4870], (4870, 7720], (7720, 18 000], and (18 000, 1 260 000], respectively. Error bars indicate standard errors. Correlations and P-values are estimated from unbinned data, using Spearman's rank correlation tests. (D) Comparison of f_matching codon between paralogous genes in yeast. Red dots show gene pairs with at least a 20-fold expression difference, whereas gray dots show gene pairs with a <20-fold expression difference. There are significantly more red dots above the diagonal line than expected by chance (P=3.72 × 10⁻³, binomial test). (E) Comparison of f_matching aa between paralogous genes in yeast. Colors have the same meanings as in D. There are significantly more red dots over the diagonal line than expected by chance (P=0.0357, binomial test).
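Panels D and E rely on a binomial test for the number of paralog pairs that fall above the diagonal. The sketch below shows one way such a test could be computed, assuming a one-sided alternative and a chance probability of 0.5 for a pair landing above the diagonal. Both choices, the function name, and the example counts are assumptions made for illustration; the caption does not give the underlying counts or state whether the test was one- or two-sided.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p): a one-sided P-value."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical counts: 8 of 10 high-contrast pairs lie above the diagonal.
p_value = binomial_upper_tail(8, 10)
```

With these made-up counts the tail sums the probabilities of 8, 9, and 10 pairs above the diagonal, which is (45 + 10 + 1) / 1024.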
Figure 5
Evolutionary conservation of amino acid residues correlates with the mutational sensitivity to misfolding. Proteins with at least three varied sites are considered. (A) An example (codon no. 58 of YAL001C) showing the measurement of the mutational sensitivity (S) of a codon, which is defined as the mean p_misfold of its neighbors that are one nonsynonymous mutation away, indicated by the dotted red line. Here, p_misfold is the protein misfolding probability of a mutant relative to that of the wild-type gene. The nucleotide differences from the wild-type as well as the altered amino acids are colored in red. (B) Fraction of genes with S_conserved > S_varied increases significantly with expression level. Here, S_conserved and S_varied are the mean S-values for codons with conserved and varied amino acids between S. cerevisiae and S. paradoxus orthologs, respectively. The genes are grouped into 100 equal-size bins according to the yeast protein expression level. (C) The ratio of S_conserved to S_varied within a gene is positively correlated with its expression level. In B and C, correlation coefficients and significance levels are determined by Spearman's rank correlation tests.
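The mutational sensitivity S in panel A is an average over the codons reachable from the wild-type codon by a single nonsynonymous nucleotide change. A minimal sketch of that enumeration follows, using the standard genetic code. It assumes that mutations to stop codons are excluded, which the caption does not state, and it takes the relative misfolding probability as a caller-supplied function (`relative_pmisfold`, a hypothetical placeholder), since the caption does not describe how that quantity is computed.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1) in TTT, TTC, TTA, TTG, TCT, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nonsynonymous_neighbors(codon):
    """Codons one nucleotide away that encode a different amino acid.

    Assumption: mutations to stop codons ("*") are left out.
    """
    wild_aa = CODON_TABLE[codon]
    neighbors = []
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            mutant_aa = CODON_TABLE[mutant]
            if mutant_aa != wild_aa and mutant_aa != "*":
                neighbors.append(mutant)
    return neighbors


def mutational_sensitivity(codon, relative_pmisfold):
    """Mean relative misfolding probability over the nonsynonymous neighbors.

    relative_pmisfold is a hypothetical callable mapping a mutant codon to its
    misfolding probability relative to the wild type.
    """
    neighbors = nonsynonymous_neighbors(codon)
    if not neighbors:
        raise ValueError(f"no nonsynonymous neighbors for {codon}")
    return sum(relative_pmisfold(m) for m in neighbors) / len(neighbors)
```

For example, `nonsynonymous_neighbors("ATG")` returns nine codons because every single-nucleotide change to the methionine codon alters the amino acid, whereas a glycine codon such as GGG has only six because its third-position changes are synonymous.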
