Nat Biotechnol, 27 (10), 946-50

Automated Design of Synthetic Ribosome Binding Sites to Control Protein Expression

Howard M Salis et al. Nat Biotechnol.

Abstract

Microbial engineering often requires fine control over protein expression, for example to connect genetic circuits or to control flux through a metabolic pathway. To circumvent the need for trial-and-error optimization, we developed a predictive method for designing synthetic ribosome binding sites, enabling rational control over the protein expression level. Experimental validation of >100 predictions in Escherichia coli showed that the method is accurate to within a factor of 2.3 over a range of 100,000-fold. The design method also correctly predicted that reusing identical ribosome binding site sequences in different genetic contexts can result in different protein expression levels. We demonstrate the method's utility by rationally optimizing protein expression to connect a genetic sensor to a synthetic circuit. The proposed forward engineering approach should accelerate the construction and systematic optimization of large genetic systems.
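
The stated factor-of-2.3 accuracy can be related to the free energy errors reported in the Figure 2 caption. The following is a worked sketch, not the authors' derivation. It assumes that Equation 1 has the form log F = −β ΔGtot + log K with a natural logarithm, that K is a sequence-independent constant (K is a symbol introduced here for illustration), and that a more negative ΔGtot corresponds to higher expression. Under those assumptions an error of |ΔΔG| in the predicted ΔGtot becomes a multiplicative error in the predicted expression level:

```latex
\log F = -\beta\,\Delta G_{tot} + \log K
\quad\Longrightarrow\quad
\frac{F_{\text{measured}}}{F_{\text{predicted}}}
  = \exp\!\bigl(\pm\,\beta\,\lvert\Delta\Delta G\rvert\bigr)

% With the slope and mean error given for the synthetic RBSs (Figure 2E, 2F):
\exp(0.45 \times 1.82) = \exp(0.819) \approx 2.27
```

With β = 0.45 and an average |ΔΔG| of 1.82 kcal/mol, the result of about 2.3-fold appears consistent with the accuracy quoted in the abstract.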

Figures

Figure 1
A thermodynamic model of bacterial translation initiation. (A) The ribosome translates an mRNA transcript and produces a protein in a four-step process: the rate-limiting assembly of the 30S pre-initiation complex, translation initiation, translation elongation, translation termination, and the turnover of ribosomal subunits and other factors. (B) The thermodynamic free energy change during the translation initiation step is determined by five molecular interactions that participate in the initial and final states of the system. See text for a description of each free energy term. The Watson-Crick base pairs and G:U wobbles (red lines) are shown.
Figure 2
The design method has two modes of operation: (A) The method can predict the relative translation initiation rate of an existing RBS when placed in front of a protein coding sequence. The method calculates the ΔGtot from the input sequence. According to Equation 1, a linear relationship between the log protein fluorescence and the predicted ΔGtot is expected. (B) The fluorescence levels from 28 natural or existing RBSs in front of the RFP fluorescent protein are measured (circles) and compared to the predicted ΔGtot calculations. The error bars are calculated as the standard deviation of 6 measurements performed on two different days. The expected relationship is obtained (line, R² = 0.54) with a slope β = 0.45 ± 0.05. (C) A histogram shows the distribution of error in the predicted ΔGtot, denoted by |ΔΔG|, of the sequences in B. The average of this distribution is 2.11 kcal/mol. (D) An optimization algorithm with Metropolis criteria, the sequence constraints, and simulated annealing uses iterations of mutation and selection to identify an RNA sequence that is predicted to have the target ΔGtot. (E) The fluorescence levels from 29 synthetic RBSs in front of RFP are measured (circles) and compared to the predicted ΔGtot calculations. The error bars are calculated as the standard deviation of at least 5 measurements performed on 2 different days. The expected linear relationship between log protein expression level and predicted ΔGtot is shown (line, R² = 0.84) with slope β = 0.45 ± 0.01. (F) A histogram shows the distribution of the error, |ΔΔG|. The average of the distribution is 1.82 kcal/mol and fits well to a one-sided Gaussian distribution (red line) with standard deviation σ = 2.44 kcal/mol.
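
The optimization loop described in panel D (mutation, selection by a Metropolis criterion, simulated annealing) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The energy function `toy_dg_tot` is a hypothetical placeholder with no physical meaning; a real design run would substitute a thermodynamic calculation of ΔGtot. The function name `design_rbs`, the cooling schedule, and all default parameters are likewise assumptions, and the sequence constraints mentioned in the caption are omitted.

```python
import math
import random

BASES = "ACGU"


def toy_dg_tot(seq):
    """Hypothetical placeholder for a dG_tot calculation (NOT a physical model).

    Starts at +10 and subtracts 1.5 per purine (A or G), purely so that the
    optimizer below has a score to steer toward a target value.
    """
    return 10.0 - 1.5 * sum(base in "AG" for base in seq)


def design_rbs(target_dg, length=20, steps=5000, t_start=2.0, t_end=0.05,
               energy=toy_dg_tot, seed=0):
    """Search for an RNA sequence whose energy(seq) is close to target_dg."""
    rng = random.Random(seed)
    seq = [rng.choice(BASES) for _ in range(length)]
    cost = abs(energy("".join(seq)) - target_dg)
    best_seq, best_cost = list(seq), cost

    for step in range(steps):
        # Simulated annealing: geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))

        # Mutation: replace one random position with a different base.
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice([b for b in BASES if b != old])
        new_cost = abs(energy("".join(seq)) - target_dg)

        # Metropolis criterion: always accept improvements; accept a worse
        # sequence with probability exp(-(increase in cost) / temperature).
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best_seq, best_cost = list(seq), cost
        else:
            seq[pos] = old  # Reject: undo the mutation.

    return "".join(best_seq), best_cost


if __name__ == "__main__":
    sequence, error = design_rbs(target_dg=-5.0)
    print(sequence, toy_dg_tot(sequence), error)
```

Accepting occasional worse sequences early on, while the temperature is high, lets the search escape local minima. As the temperature falls the loop behaves more like a greedy descent toward the target value.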
Figure 3
The design method can control the expression level of different proteins by predicting the impact of changing the protein coding sequence. (A) The fluorescence levels from 23 synthetic RBSs in front of two different protein coding sequences are measured and compared to the predicted ΔGtot calculations. The two proteins are TetR27-RFP (diamonds) and AraC27-RFP (squares). The expected relationship between the log protein fluorescence and the predicted ΔGtot is obtained for each protein coding sequence (TetR27-RFP, R² = 0.54; AraC27-RFP, R² = 0.95). (B) Reusing the same RBS sequence with two different protein coding sequences can alter the translation initiation. Fluorescence levels from identical RBS sequences in front of either RFP (white bars) or a chimeric fluorescent protein (either LacI27-RFP, TetR27-RFP, or AraC27-RFP; black bars) are shown. (C) The design method must use the correct protein coding sequence to accurately predict the ΔGtot. The fluorescence levels from 14 pairs of RBS sequences in front of either RFP (black circles) or a chimeric fluorescent protein (LacI27-RFP, triangles; TetR27-RFP, diamonds; AraC27-RFP, squares) are measured. When the correct protein coding sequence is used to calculate the ΔGtot, the expected relationship between log protein fluorescence and ΔGtot is obtained (lines, R² = 0.62 and R² = 0.51). Otherwise, the thermodynamic model does not correctly predict the expression level (R² = 0.04 and 0.02). The error bars are calculated as the standard deviation of at least 6 measurements performed on 2 different days.
Figure 4
Optimal connection of a sensor input to an AND gate genetic circuit. (A) A functional AND gate genetic circuit will only turn on the gfp reporter output when both the PBAD and Psal promoter inputs are sufficiently induced by arabinose and salicylate, respectively. (B) The quantitative model and design method predict a fitness curve F(ΔGtot) (blue line), relating the predicted ΔGtot of the PBAD promoter's RBS sequence to the quality of the genetic circuit's AND logic. The accuracy of this curve is tested by assaying the fitness of nine genetic circuit variants, each containing a synthetic RBS that was designed to possess a selected ΔGtot (black circles). (C) The amount of gfp fluorescence is shown in response to combinations of arabinose (0.0, 1.3×10⁻³, 8.3×10⁻², and 1.3 mM) and salicylate (0.0, 6.1×10⁻⁴, 3.9×10⁻², and 0.62 mM) for selected AND gate genetic circuits. These genetic circuits contain RBS sequences with predicted ΔGtot values of 12.3, 2.18, 0.60, and −1.48 kcal/mol. The error bars are calculated as the standard deviation of 2 measurements of fitness performed on 2 different days.
