Nature. 2020 Sep;585(7823):124-128.
doi: 10.1038/s41586-020-2638-5. Epub 2020 Aug 26.

Functionally uncoupled transcription-translation in Bacillus subtilis

Grace E Johnson et al. Nature. 2020 Sep.

Abstract

Tight coupling of transcription and translation is considered a defining feature of bacterial gene expression [1,2]. The pioneering ribosome can both physically associate and kinetically coordinate with RNA polymerase (RNAP) [3-11], forming a signal-integration hub for co-transcriptional regulation that includes translation-based attenuation [12,13] and RNA quality control [2]. However, it remains unclear whether transcription-translation coupling, together with its broad functional consequences, is indeed a fundamental characteristic of bacteria other than Escherichia coli. Here we show that RNAPs outpace pioneering ribosomes in the Gram-positive model bacterium Bacillus subtilis, and that this 'runaway transcription' creates alternative rules for both global RNA surveillance and translational control of nascent RNA. In particular, uncoupled RNAPs in B. subtilis explain the diminished role of Rho-dependent transcription termination, as well as the prevalence of mRNA leaders that use riboswitches and RNA-binding proteins. More broadly, we identified widespread genomic signatures of runaway transcription in distinct phyla across the bacterial domain. Our results show that coupled RNAP-ribosome movement is not a general hallmark of bacteria. Instead, translation-coupled transcription and runaway transcription constitute two principal modes of gene expression that determine genome-specific regulatory mechanisms in prokaryotes.

Figures

Extended Data Figure 1. Transcription and translation kinetics in slow growth.
Induction time course of lacZ mRNA (top) and protein (bottom) as in Fig. 1b, d for WT B. subtilis grown in MOPS minimal media + 0.4% maltose (growth rate 0.65 h⁻¹). Lines indicate linear fits after signals rise. Uncertainties are standard error of the mean (SEM) among biological replicates (2).
Extended Data Figure 2. Validation of β-gal assay.
a, Measurement of linear range of microplate reader. Fluorescence relative to input of dilutions of an induced culture of bGJ74 (full-length lacZ) at steady-state. See Methods. b, Effect of different stop solutions on stopping translation. Induction time courses of pycA-lacZα protein collected into a stop solution containing chloramphenicol and erythromycin (grey, all plots, from Fig. 1d) or with either flash freezing in liquid nitrogen (top), 15 μL toluene added to the stop solution (middle), or 50 μL 12.5 mg/mL lincomycin added to the stop solution (bottom), shown in red in each plot (as described in Methods). Lines indicate linear fits after signals rise and τTL is indicated. c, Induction time course of truncated pycA-lacZα mRNA (top) and protein (bottom) as in Fig. 1b, d. Lines indicate linear fits after signals rise. Uncertainties are standard error of the mean (SEM) among biological replicates (2).
Extended Data Figure 3. Contribution of non-essential RNAP subunits and transcription factors to fast transcription.
Induction time course of pycA-lacZα mRNA in various mutant backgrounds as in Fig. 1b, d. Time course of the same construct in WT from Fig. 1d is also shown for reference. Lines indicate linear fits after signals rise. Uncertainties are standard error of the mean (SEM) among biological replicates (1 for ΔykzG and 2 for all others). Time of appearance of full-length mRNA in mutants is not substantially different from that measured in WT (see Supplementary Discussion).
Extended Data Figure 4. Phylogenetic distribution of domain architecture for NusG, NusA and RpoB.
a, Multiple sequence alignments (Methods) for NusA (602 columns), NusG (325 columns), and the β subunit of the RNAP, RpoB (1732 columns), for species shown in Fig. 4. The alignments are visualized in a binary fashion to highlight presence/absence of certain domains: white indicates presence of an amino acid in the alignment, and black indicates presence of a gap. The alignments were trimmed by removing columns with >95% gaps. Species with no homologs, partial or pseudogene homologs, or multiple homologs are shown as grey lines. Phylogenetic tree and fraction of terminators with stop-to-stem distances within 12 nt from Fig. 4 are reproduced in linearized form. The positions of domains from the E. coli protein are identified by bars above the alignments. For RpoB, previously identified conserved bacterial regions (βb1 to βb16) are shown. The NusA C-terminal domain (orange box) is missing in a large fraction of Firmicutes (partly present in Mollicutes, which include Mycoplasma and Spiroplasma; red brace), Campylobacterota, Thermotogota, Fusobacteria, and Actinobacteria. NusG has a largely conserved domain architecture, with Actinobacteria showing an N-terminal extension. As previously noted in detail, the β subunit of the RNAP has multiple insertion domains in diverse bacteria. Insertion domain βSI2 (green box), recently implicated in transcription-translation coupling, is lineage-specific and absent in many clades of Gram-positive bacteria, as previously noted. The dashed box in the tree highlights the clade containing Mycoplasma. b, Close-up view of our analysis of the clade containing Mycoplasma (indicated by black dots). Sub-tree includes species with n≤20 identified terminators (marked in light red). Grayscale representation of stop-to-stem distributions and fraction of terminators with d≤12 nt are the same as Fig. 4. M. pneumoniae is highlighted in cyan, and has no identified terminator (0/14) with d≤12 nt. c, Cumulative distribution of stop-to-stem distance for bioinformatically identified terminators in M. pneumoniae.
Extended Data Figure 5. Details of ORF extension constructs and transcription terminator readthrough vs. stop-to-stem distances.
a, Sequence for terminators T1 and T2 for three variants (T1+: pupG original terminator, T1-: disrupted pupG terminator, ORF extension: original pupG with upstream ORF extended inside the loop of the terminator). For T1 and T2, blue and grey shading respectively marks the position of the terminator hairpin stems, with free energy of folding ΔG indicated. Black stars indicate introduced mutations. Downward carets (▼) indicate the positions of the 3’ ends associated with intrinsic terminators as determined by Rend-seq. Red dashed line indicates the complementary region of the Northern blot probe to the readthrough product. b, Terminator readthrough fraction (defined as the Rend-seq read density after terminator divided by read density upstream of terminator, see for details) as a function of stop-to-stem distance for E. coli intrinsic terminators from Fig. 2 for which readthrough could be reliably estimated (n=392). Terminators with stop-to-stem distance d≤12 nt are highlighted in red. c, Cumulative distribution function of terminator readthrough for terminators far from (black, d>12 nt) and close to (red, d≤12 nt) stop codons. Terminators close to genes have significantly more readthrough (less termination), p<10⁻³ (q30(d>12) and q30(d≤12) indicate the 30th percentile in the readthrough distribution for the two categories of terminators, with fold-change F30 := q30(d≤12)/q30(d>12); p-value determined as the fraction of bootstrap random sub-samplings of the readthrough distributions with q30(d>12) > q30(d≤12), see Methods). d, Terminator readthrough as a function of ΔGU, the U-tract DNA/RNA hybrid free energy (measure of U-tract quality, with larger ΔGU corresponding to U-rich U-tract). Grey shading indicates cutoff (ΔGU>−5 kcal/mol) to select good U-tract terminators. e, Same as c, but restricting to good U-tract terminators, still showing significantly less termination for terminators near ORFs, p<10⁻³ (same as above, see Methods). f-i, Same as b-e, but with terminators from B. subtilis. Terminators close to ORFs do not show less readthrough than their gene-distal counterparts (p>0.3, p-value determined with same strategy as above, see Methods).
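The percentile comparison in panels c and e can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `bootstrap_q30`, resampling with replacement at full sample size, and the default number of resamplings are all assumptions, since the exact sub-sampling scheme is described only in the Methods. It returns the fold-change F30 and the fraction of resamplings in which the gene-distal 30th percentile exceeds the gene-proximal one.

```python
import numpy as np

def bootstrap_q30(far, close, n_boot=10_000, q=30.0, seed=0):
    """Return (fold-change of q-th percentiles, bootstrap p-value).

    far   : readthrough fractions for terminators with d > 12 nt
    close : readthrough fractions for terminators with d <= 12 nt
    The p-value is the fraction of resamplings in which the percentile
    of `far` exceeds that of `close`.
    Assumption: resampling with replacement at the original sample sizes.
    """
    rng = np.random.default_rng(seed)
    far = np.asarray(far, dtype=float)
    close = np.asarray(close, dtype=float)

    # Fold-change of the q-th percentiles on the full data (F30 when q=30)
    fold = np.percentile(close, q) / np.percentile(far, q)

    n_reversed = 0
    for _ in range(n_boot):
        far_s = rng.choice(far, size=far.size, replace=True)
        close_s = rng.choice(close, size=close.size, replace=True)
        if np.percentile(far_s, q) > np.percentile(close_s, q):
            n_reversed += 1
    return fold, n_reversed / n_boot
```

A p-value of exactly zero from such a procedure only bounds the true value by 1/n_boot, which is presumably why the caption reports p<10⁻³ rather than a point estimate.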
Extended Data Figure 6. Examples of identified nested antisense RNAs.
B. subtilis shows a number (n=35, see Methods for selection criteria) of mRNAs with long untranslated regions fully encompassing genes in the antisense direction, which we call nested antisense RNAs (also termed non-contiguous operons or excludons). The majority (n=29/35) of these have a fold-change in mRNA level of less than two upon rho deletion (Fig. 3b). a, Schematic of a nested antisense RNA with corresponding Rend-seq signal, with orange peaks and blue peaks marking 5’ and 3’ boundaries of the transcript. b, Representative examples of nested antisense RNAs with mRNA level fold-change upon rho deletion less than 2. Rend-seq data (peak shadows removed, see for details on data processing) is shown. Orange and blue signal correspond to summed 5’-mapped reads and 3’-mapped reads, respectively (rpm: reads per million). Top trace corresponds to wildtype, and bottom trace to Δrho. Horizontal size marker provides positional scale (200 bp) on each subpanel. Sense and antisense genes are shown in dark and light grey, respectively. Double line breaks (//) indicate truncated Rend-seq signal at peaks. Dashed lines mark regions for which fold-change in read density for Δrho/WT was estimated. The fold-change for each instance is indicated on the graph. c, Same as b, with representative examples of nested antisense RNAs with increased expression upon rho deletion (see Fig. 3b). Three nested antisense RNAs were found in E. coli with identical criteria. See Supplementary Data 3 for a list of nested antisense RNAs identified.
Extended Data Figure 7. Expressed pseudogenes with interrupted translation in B. subtilis show no polarity.
Expressed pseudogenes endogenously present in the extant genome were used as additional independent experiments to assess the prevalence of Rho-mediated nonsense polarity in B. subtilis in situations of obligately uncoupled transcription and translation. Concomitant Rend-seq (mapping operon architecture) and ribosome profiling (measurement of translation) provide stringent data to determine translational status and transcript integrity of mRNAs. a, Schematic of analysis: for expressed pseudogenes (see Methods for selection criteria) with translation disruption, polarity was assessed by (1) comparing the mRNA read density at start and end of transcription unit, with large changes (start/end ≫ 1) indicative of polarity, and (2) fold-change of pseudogene transcript upon rho deletion. Position of the translation-disrupting mutation is shown by ▲ and X. Dark and pale grey indicate regions before and after the translation-disrupting mutation. b, Rend-seq and ribosome profiling data for the 8 identified expressed pseudogenes. Each subpanel corresponds to a pseudogene region. Top traces show Rend-seq data (orange and blue signal correspond to summed 5’-mapped reads and 3’-mapped reads, peak shadows removed, see for details on data processing). Orange peaks and blue peaks mark 5’ and 3’ boundaries of transcripts. Double line breaks (//) indicate truncated Rend-seq signal at peaks. Bottom traces show ribosome profiling data. Translation efficiency (ribosome profiling rpkm/Rend-seq rpkm) percentiles for each pseudogene sub-region (before and after translation disruption) are shown. Horizontal size marker provides positional scale (200 bp) on each subpanel. Nearby intact genes are shown in light blue. rpm: reads per million. Regions used to assess start-to-end decrease in RNA levels are marked by dashed lines. mRNA level fold-changes (start/end, and Δrho/WT) are shown. The ydzW region showed a second translation disruption in the secondary frame, shown as a pale ▲ and X. See Methods, Fig. 3b and Supplementary Data 3 for details.
Extended Data Figure 8. Most expressed pseudogenes with interrupted translation in E. coli show polarity.
Similar to Extended Data Figure 7. Expressed pseudogenes endogenously present in the extant genome were used as additional independent experiments to assess the prevalence of Rho-mediated nonsense polarity in E. coli in situations of obligately uncoupled transcription and translation. Concomitant Rend-seq (mapping operon architecture) and ribosome profiling (monitoring translation) provide stringent data to determine translational status and transcript integrity of mRNAs. a, Schematic of analysis: for expressed pseudogenes (see Methods for selection criteria) with translation disruption, polarity was assessed by comparing the mRNA read density at start and end of transcription unit, with large changes (start/end ≫ 1) indicative of polarity. b, Rend-seq and ribosome profiling data for the identified expressed pseudogenes with evidence of polarity. Each subpanel corresponds to a pseudogene region. Top traces correspond to Rend-seq data (orange and blue signal correspond to summed 5’-mapped reads and 3’-mapped reads, peak shadows removed, see for details on data processing). Orange peaks and blue peaks mark 5’ and 3’ boundaries of transcripts. Double line breaks (//) indicate truncated Rend-seq signal at peaks. Bottom traces show ribosome profiling data. Translation efficiency (ribosome profiling rpkm/Rend-seq rpkm) percentiles for each pseudogene sub-region (before and after translation disruption) are shown. Horizontal size marker provides positional scale (200 bp) on each subpanel. Light blue arrows correspond to nearby intact genes. rpm: reads per million. Regions used to assess start-to-end decrease in RNA levels are marked by dashed lines. mRNA level fold-changes (start/end) are shown. The gapC region showed sequential translation disruptions in secondary frames, shown as a pale ▲ and X. c, Same as b, but for the two cases with no evidence of polarity. The translation-disrupting mutations in ykiA and cybC are deletions of the beginning of the ORFs. See Methods, Fig. 3b and Supplementary Data 3 for details.
Extended Data Figure 9. Analysis of C-to-G ratio for putative Rho-terminated RNAs.
a, Cumulative distributions of maximum C-to-G ratio (“Max C:G”) of 100 nt sliding windows within non Rho-terminated coding sequences (CDSs, blue, n=2625) and Rho-terminated CDSs (magenta, n=10). Median of Max C:G is higher for Rho-terminated CDSs (magenta) than non Rho-terminated CDSs (blue) (p<10⁻⁵, fewer than one in 10⁵ random sub-samplings (n=10) of the non Rho-terminated distribution had a higher median maximum C-to-G ratio). b, Cumulative distributions as in a for asRNAs that are not terminated by Rho (blue, n=112) and asRNAs that are terminated by Rho (magenta, n=91). Median of Max C:G is higher for Rho-terminated asRNAs than non Rho-terminated asRNAs (p<10⁻³, fewer than one in 10³ random sub-samplings (n=10) of the non Rho-terminated distribution had a higher median maximum C-to-G ratio compared to sub-sampling (n=10) of the Rho-terminated distribution). See Methods.
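The “Max C:G” statistic used above (number of C residues divided by number of G residues, maximized over 100 nt sliding windows) can be illustrated with a short sketch. The function name `max_c_to_g` and the handling of windows without any G are assumptions made here for illustration; the source does not state how such windows are treated.

```python
def _ratio(c_count, g_count):
    # Assumption: a window with no G is scored as infinite if it contains
    # any C, and as 0.0 if it contains neither C nor G.
    if g_count == 0:
        return float("inf") if c_count > 0 else 0.0
    return c_count / g_count

def max_c_to_g(seq, window=100):
    """Maximum C-to-G ratio over all sliding windows of length `window`."""
    seq = seq.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")

    # Counts for the first window
    c_count = seq[:window].count("C")
    g_count = seq[:window].count("G")
    best = _ratio(c_count, g_count)

    # Slide one nucleotide at a time, updating counts incrementally
    for i in range(window, len(seq)):
        leaving, entering = seq[i - window], seq[i]
        c_count += (entering == "C") - (leaving == "C")
        g_count += (entering == "G") - (leaving == "G")
        best = max(best, _ratio(c_count, g_count))
    return best
```

Applied per transcript, the resulting values would be the quantities whose cumulative distributions are compared in panels a and b.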
Extended Data Figure 10. Illustration of terminator identification pipeline and analysis of stop-to-stem distribution stratified by phyla.
The terminator identification pipeline selects for strong hairpins immediately upstream of a long U-tract found downstream of genes. Thresholds on hairpin folding free energy are determined on a species-by-species basis based on properties of randomly selected regions in respective genomes. The case of V. cholerae is illustrated in a-c. a, Results of folding 10⁴ regions of 40 nt chosen at random positions in the genome. Left panel shows the 2D distribution as a heatmap (dark positions correspond to more density) of hairpin geometrical parameters (number of base pairs in stem Nbp, length of loop). Geometric thresholds are highlighted with blue dashes (5 bp ≤ Nbp ≤ 15 bp, 3 nt ≤ Loop ≤ 8 nt) and the retained region by blue shading. Right panel shows the 2D distribution as a heatmap (dark positions correspond to more density) of hairpin free energy of folding ΔGhairpin and fraction of bases paired in stem f. Thresholds ΔG1 and ΔG2 on ΔGhairpin are chosen such that the total fraction of hairpins from random regions meeting geometrical (blue shading in left panel) and thermodynamic thresholds is 1% (orange, ΔGhairpin ≤ΔG1 and f≥0.95) and 1.5% (red, ΔGhairpin ≤ΔG2 and f≥0.9). b, Similar to a, but for regions seeded by U-tracts (stretch of 5 or more consecutive T’s in the genome downstream of genes). Note the excess density of hairpins with strong energy of folding and large fraction of bases paired, corresponding to putative intrinsic terminators. c, Distribution of stop-to-stem distances for terminators passing thresholds shown in b. See Supplementary Data 2, Supplementary Data 3, and Methods for details of computational pipeline. d and e, Phylum-stratified analysis of the stop-to-stem distribution. d, Each subpanel shows as a 2D greyscale the fraction of species within each phylum (shown in Fig. 4) for which more than fraction F (y-axis) of terminators have stop-to-stem distances less than or equal to D (x-axis). Black regions correspond to no species in the phylum, white to all species. The contour line in the (D,F) space marks points where 50% of species in the phylum have fraction ≥F of their terminators with stop-to-stem distance ≤D. The yellow stars mark the thresholds used in Fig. 4 (D=12 nt, F=30%). For example, about 50% of species analyzed in the Firmicutes have more than 30% of their terminators within 12 nt of the upstream ORF (red contour line intersecting yellow star). e, The 50% species contour lines from d plotted on the same panel, showing clear separation between phyla.
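The species-specific choice of a free-energy threshold described in panel a can be sketched as below. This is a hedged illustration only: the function name `energy_threshold` and its arguments are hypothetical, hairpin folding is assumed to have been done beforehand by a separate tool, and the reading that the target fraction is taken relative to all random regions is an interpretation of the caption rather than a confirmed detail of the pipeline.

```python
import numpy as np

def energy_threshold(dg, f_paired, n_bp, loop_len, target_fraction, f_min,
                     bp_range=(5, 15), loop_range=(3, 8)):
    """Pick a free-energy cutoff from hairpins folded at random positions.

    Returns the value T such that roughly `target_fraction` of ALL random
    regions satisfy the geometric thresholds, f >= f_min and dG <= T.
    Inputs are per-region arrays computed beforehand (folding not shown).
    """
    dg = np.asarray(dg, dtype=float)
    f_paired = np.asarray(f_paired, dtype=float)
    n_bp = np.asarray(n_bp)
    loop_len = np.asarray(loop_len)

    # Geometric filter from the caption: 5 <= Nbp <= 15 and 3 <= loop <= 8
    geom_ok = ((n_bp >= bp_range[0]) & (n_bp <= bp_range[1]) &
               (loop_len >= loop_range[0]) & (loop_len <= loop_range[1]))
    eligible = np.sort(dg[geom_ok & (f_paired >= f_min)])  # most stable first

    # Number of random regions allowed to pass at the target fraction
    k = int(np.floor(target_fraction * dg.size))
    if k < 1 or k > eligible.size:
        raise ValueError("target fraction cannot be met with these regions")
    return eligible[k - 1]
```

Under this reading, ΔG1 would correspond to `target_fraction=0.01, f_min=0.95` and ΔG2 to `target_fraction=0.015, f_min=0.9`, each evaluated on the random regions of one genome.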
Fig. 1. Fast RNAP movement results in runaway transcription.
a, Schematic of inducible lacZ expression system in B. subtilis. Region probed by qRT-PCR is labeled in magenta. b, Induction time courses of full-length lacZ mRNA (top) and protein (bottom), measured by qRT-PCR and beta-galactosidase assays, respectively. After the first appearance times (τTX and τTL), mRNAs accumulate linearly with time whereas proteins accumulate quadratically with time. Lines indicate linear fits of transformed data after signals rise. Shaded regions indicate time difference between τTX and τTL. Uncertainties are standard error of the mean (SEM) among biological replicates (3 for B. subtilis beta-galactosidase assay, 2 for all others). c, Schematic of lacZα complementation reporter for endogenous genes. Endogenous 5’ UTR and gene are indicated in green. Pxyl: xylose promoter. d, Same as b, but for endogenous genes. Translation efficiencies for pycA and tkt are 50th and 93rd percentiles among B. subtilis genes, respectively. Three biological replicates for pycA-lacZα beta-galactosidase assay, 2 for all others. e, Table of first appearance times for truncated pycA constructs. Uncertainties are standard error of the mean (SEM) among biological replicates (2). f, Plot showing estimated terminal ribosome-RNAP distance as a function of gene length (bottom). Elongation rates are based on e, and translation initiation times are assumed to be negligible. Histogram shows distribution of gene lengths in B. subtilis (top). See also Extended Data Fig. 1-3.
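The first appearance times in panels b and d follow from linear fits of transformed data: after the signal rises, mRNA is linear in time and protein is quadratic in time, so the square root of the protein signal is linear. A minimal sketch of extracting the x-intercept of such a fit is given below. The function name `first_appearance_time`, the `rise_fraction` rule for selecting post-rise points, and the assumption that the signal is already background-subtracted are illustrative choices, not the authors' stated procedure.

```python
import numpy as np

def first_appearance_time(t, signal, power=1.0, rise_fraction=0.1):
    """Estimate the time at which a signal first appears.

    power=1 treats the signal as linear in time after it rises (mRNA);
    power=2 treats it as quadratic (protein), so signal**(1/2) is fitted.
    Assumptions: `signal` is background-subtracted, and points above
    `rise_fraction` of the maximum transformed signal count as post-rise.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(signal, dtype=float)

    # Transform so that the post-rise portion is linear in time
    y_lin = np.clip(y, 0.0, None) ** (1.0 / power)

    # Keep only points after the signal has risen
    mask = y_lin > rise_fraction * y_lin.max()

    # Linear fit; the x-intercept is the estimated first appearance time
    slope, intercept = np.polyfit(t[mask], y_lin[mask], 1)
    return -intercept / slope
```

Calling the function with `power=1` on the mRNA time course and `power=2` on the protein time course would give estimates of τTX and τTL, whose difference corresponds to the shaded regions in the panels.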
Fig. 2. Lack of translational control on transcription.
a, Schematics of ORF-extension construct and controls for pupG (80th percentile in translation efficiency). T1: pupG terminator (99.97% termination efficiency), T2: sodA terminator (99.9% termination efficiency). Stars indicate mutations. The stop-to-stem distances d for the native and extended constructs are 26 nt and −14 nt respectively. b-c, Northern blots against readthrough isoforms (top) and control for pupG expression (bottom) for constructs indicated in a. N.D.: not detected. For gel source data, see Supplementary Figure 1. Northern blotting was performed twice for B. subtilis (biological replicates) and once for E. coli. Results for both species were independently confirmed (biological replicates) by qRT-PCR (Methods). d, Examples of terminator stem-loops overlapping with stop codons (patA d=−12 nt, ispG d=−5 nt). Peaks in Rend-seq data show sites of termination. Terminator stems are highlighted. Stop codons are indicated in red. Translation efficiencies for patA and ispG are 63rd and 90th percentiles, respectively, in B. subtilis. e, Genome-wide distribution of stop-to-stem distances d (see inset) for high-confidence intrinsic terminators in B. subtilis (top, n=1228) and E. coli (bottom, n=409). ORF-overlapping terminators (d≤0) are in dark magenta, and ribosome-overlapping terminators (d≤12 nt) are in medium and dark magenta, with respective fraction of terminators indicated. See also Extended Data Fig. 5, Supplementary Data 2.
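The fractions reported in panel e can be computed from a list of stop-to-stem distances with a few lines of code. The sketch below is illustrative: the function name `stop_to_stem_summary` is hypothetical, and it assumes d is given in nucleotides with negative values when the terminator stem overlaps the ORF, as the examples in panel d suggest.

```python
def stop_to_stem_summary(distances, ribosome_cutoff=12):
    """Summarize stop-to-stem distances d (nt) for a set of terminators.

    Assumption: d <= 0 means the terminator stem overlaps the ORF, and
    d <= ribosome_cutoff means it lies within reach of a ribosome at the
    stop codon (12 nt in the captions).
    """
    n = len(distances)
    if n == 0:
        raise ValueError("no terminators supplied")
    n_orf = sum(1 for d in distances if d <= 0)
    n_ribo = sum(1 for d in distances if d <= ribosome_cutoff)
    return {
        "n": n,
        "frac_orf_overlapping": n_orf / n,
        "frac_ribosome_overlapping": n_ribo / n,
    }
```

The `frac_ribosome_overlapping` value for one genome is the kind of per-species fraction that is compared against the 30% threshold in Fig. 4.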
Fig. 3. Signals of Rho-dependent termination.
a, Quantification of mRNA levels with and without premature stop codons (‘x’). mRNA levels are quantified by qRT-PCR for the lacZα region (blue) relative to gyrA. Comparison of mRNA levels between cells without Rho (green) and WT (magenta) are shown. b, Distributions of mRNA level changes between two WT replicates (magenta) and between WT and Δrho (green) as measured by Rend-seq. Expression changes for asRNAs nested within operons (Extended Data Fig. 6) and pseudogenes (Extended Data Fig. 7 and 8) are indicated below. n: number of cases. Box plots are defined by median, 25th and 75th percentiles. c, Example of a Rho-terminated asRNA (cssSAS). Rend-seq data in WT and Δrho show regions of potential termination sites (orange: 5’-end mapped reads, blue: 3’-end mapped reads). d, Quantification of mRNA levels with variants of cssSAS insertions (with 7 mutations to replace in-frame stop codons with sense codons, see Supplementary Data 1 for sequence). Relative mRNA expression measured as in a. e, Quantification of C-to-G ratios (number of C residues divided by number of G residues) in 100-nt moving windows of cssSAS. See also Extended Data Fig. 6-9, Supplementary Data 3.
Fig. 4. Phylogenomic distribution of uncoupling.
a, Phylogenetic tree (center) is overlaid with grayscale heatmap representation of the distributions of stop-to-stem distances d for each species (middle ring, range in d shown from −20 to 120 nt). Full and dashed lines mark d=0 nt and d=12 nt respectively. The species-specific fraction F of high-confidence terminators with d≤12 nt is shown in the outer ring. Number of species per phylum with at least 30% of terminators with d≤12 nt is indicated under the phylum name. Species without Rho homologs are marked with lines next to the tree (grey: no homolog, red: partial homolog or pseudogene). The 1434 representative or reference genomes (with n≥20 identified terminators) from RefSeq are included. Tandem terminators are excluded. See Extended Data Fig. 10, Supplementary Data 4-5, Methods. Insets (L. monocytogenes, n=705 identified terminators, F=71.2% of terminators with d≤12 nt; P. aeruginosa, n=216, F=6.9%) show representative examples of bioinformatically determined stop-to-stem distributions (cf. Fig. 2e) with their heatmap representation (above) shown in middle ring. Dark and light portions of the histograms in insets highlight terminators with d≤12 nt and d>12 nt respectively. b and c, Schematics of translation-coupled and runaway transcription and some of their respective functional consequences.

References

Main references:

    1. Adhya S & Gottesman M Control of Transcription Termination. Annu. Rev. Biochem 47, 967–996 (1978).
    2. Richardson JP Preventing the synthesis of unused transcripts by rho factor. Cell 64, 1047–1049 (1991).
    3. Landick R, Carey J & Yanofsky C Translation activates the paused transcription complex and restores transcription of the trp operon leader region. Proc. Natl. Acad. Sci. USA 82, 4663–4667 (1985).
    4. Proshkin S, Rahmouni AR, Mironov A & Nudler E Cooperation between translating ribosomes and RNA polymerase in transcription elongation. Science 328, 504–508 (2010).
    5. Burmann BMB et al. A NusE:NusG complex links transcription and translation. Science 328, 501–504 (2010).

Methods References

    1. Harwood CR & Cutting SM Molecular Biological methods for Bacillus. Molecular Biological Methods for Bacillus (John Wiley, 1990).
    1. Li G-W, Burkhardt D, Gross C & Weissman JS Quantifying Absolute Protein Synthesis Rates Reveals Principles Underlying Allocation of Cellular Resources. Cell 157, 624–635 (2014).
    1. DeLoughery A, Lalanne J-B, Losick R & Li G-W Maturation of polycistronic mRNAs by the endoribonuclease RNase Y and its associated Y-complex in Bacillus subtilis. Proc. Natl. Acad. Sci. USA 115, E5585–E5594 (2018).
    1. Zhu M, Dai X & Wang Y-P Real time determination of bacterial in vivo ribosome translation elongation speed based on LacZα complementation system. Nucleic Acids Res. 44, gkw698 (2016).
    1. Bonekamp F, Clemmesen K, Karlstrom O & Jensen KF Mechanism of UTP-modulated attenuation at the pyrE gene of Escherichia coli: an example of operon polarity control through the coupling of translation to transcription. Embo J 3, 2857–2861 (1984).
