BMC Genomics. 2014 Aug 16;15(1):684.
doi: 10.1186/1471-2164-15-684.

Primary transcriptome map of the hyperthermophilic archaeon Thermococcus kodakarensis

Dominik Jäger et al. BMC Genomics. 2014.

Abstract

Background: Prokaryotes have relatively small genomes, densely packed with protein-encoding sequences. RNA sequencing has, however, revealed surprisingly complex transcriptomes, and here we report the transcripts present in the model hyperthermophilic archaeon, Thermococcus kodakarensis, under different physiological conditions.

Results: Sequencing of cDNA libraries generated from RNA isolated from cells under different growth and metabolic conditions has identified >2,700 sites of transcription initiation and established a genome-wide map of transcripts and consensus sequences for transcription initiation and post-transcription regulatory elements. The primary transcription start sites (TSS) upstream of 1,254 annotated genes, plus 644 primary TSS and their promoters within genes, are identified. Most mRNAs have a 5'-untranslated region (5'-UTR) 10 to 50 nt long (median = 16 nt), but ~20% have 5'-UTRs from 50 to 300 nt long and ~14% are leaderless. Approximately 50% of mRNAs contain a consensus ribosome binding sequence. The results identify TSS for 1,018 antisense transcripts, most with sequences complementary to either the 5'- or 3'-region of a sense mRNA, and confirm the presence of transcripts from all three CRISPR loci, the RNase P and 7S RNAs, all tRNAs and rRNAs and 69 predicted snoRNAs. Two putative riboswitch RNAs were present in growing but not in stationary phase cells. The procedure used is designed to identify TSS but, assuming that the number of cDNA reads correlates with transcript abundance, the results also provide a semi-quantitative documentation of the differences in T. kodakarensis genome expression under different growth conditions and confirm previous observations of substrate-dependent specific gene expression. Many previously unanticipated small RNAs have been identified, some with relatively low GC content (≤ 50%) and sequences that do not fold readily into base-paired secondary structures, contrary to the classical expectations for non-coding RNAs in a hyperthermophile.

Conclusion: The results identify >2,700 TSS, including almost all of the primary sites of transcription initiation upstream of annotated genes, plus many secondary sites, sites within genes and sites resulting in antisense transcripts. The T. kodakarensis genome is small (~2.1 Mbp) and tightly packed with protein-encoding genes, but the transcriptomes established also contain many non-coding RNAs and predict extensive RNA-based regulation in this model Archaeon.

Figures

Figure 1
Transcription start site (TSS) classification. (A) Diagram showing all potential locations of TSS. In this report, they are categorized as primary (pTSS; most abundant transcript based on the highest number of cDNA reads) or secondary TSS (sTSS; all other transcripts) when located ≤300 bp upstream of a gene with transcription on the sense strand of that gene. They are categorized as internal TSS (iTSS) when located within an annotated gene and antisense TSS (aTSS) when located either in, or within ≤100 bp of a gene, on the antisense strand. A TSS not readily assigned to any of these categories was categorized as an orphan TSS (oTSS). (B) The distribution and overlap of the TSSs identified in each category. (C) The distribution of lengths of 5′-UTRs based on the combined number of pTSS and sTSS that initiate transcription at each position, with 0 being the first bp of the translation start codon. The numbers of TSS resulting in 5'-UTRs ≤8 nt, and so considered leaderless mRNAs, are indicated by the grey bars. The insert shows the consensus of the RBS identified in T. kodakarensis.
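The categories in panel (A) can be read as a small set of coordinate rules. The following is a minimal sketch of those rules, not the authors' pipeline: the `Gene` record, the function names, and the choice to treat a TSS at position 0 (the first bp of the start codon) as gene-associated rather than internal are all assumptions made for illustration. A TSS may fall into more than one category, which matches the overlap shown in panel (B).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int   # leftmost genome coordinate
    end: int     # rightmost genome coordinate
    strand: str  # "+" or "-"


def upstream_distance(pos, gene):
    """Distance in bp from a TSS to the first bp of the start codon.

    0 means the TSS is the first bp of the start codon; negative values
    mean the TSS lies downstream of it.
    """
    return gene.start - pos if gene.strand == "+" else pos - gene.end


def classify_tss(pos, strand, genes):
    """Return the set of categories a TSS falls into, or {"orphan"}."""
    labels = set()
    for g in genes:
        if strand == g.strand:
            d = upstream_distance(pos, g)
            if 0 <= d <= 300:
                labels.add("upstream")   # candidate pTSS or sTSS
            elif g.start <= pos <= g.end:
                labels.add("internal")   # iTSS
        elif g.start - 100 <= pos <= g.end + 100:
            labels.add("antisense")      # aTSS: in, or within 100 bp of, a gene
    return labels or {"orphan"}


def split_primary_secondary(reads_by_tss):
    """Label the upstream TSS of one gene: most reads is pTSS, the rest sTSS."""
    primary = max(reads_by_tss, key=reads_by_tss.get)
    return {p: ("pTSS" if p == primary else "sTSS") for p in reads_by_tss}
```

For example, with a single hypothetical gene `Gene("geneA", 1000, 2000, "+")`, a plus-strand TSS at 984 is classified as upstream (a 16 nt 5'-UTR), one at 1500 as internal, and a minus-strand TSS at 2050 as antisense.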
Figure 2
Consensus promoter motifs. Each frame shows the best ranking sequence motif identified by a MEME search [45] performed on a window extending from position -50 to the TSS for (A) pTSS of annotated ORFs, (B) sTSS of annotated ORFs, (C) iTSS and (D) aTSS. An orphan transcript was so designated when there was no detectable association of a TSS with an annotated gene, and (E) shows the consensus promoter motifs for orphan transcripts. The negative numbers on the x-axis indicate the distance in bp upstream from the identified TSS.
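Preparing input for a motif search of this kind amounts to cutting the 50 bp upstream of each TSS out of the genome, strand-aware. The sketch below illustrates that step only. The function names are hypothetical, the genome is treated as a linear string (wrap-around on a circular chromosome is ignored), and the window is assumed to exclude the TSS base itself, which the caption does not specify.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_window(genome, tss, strand, width=50):
    """Return up to `width` bases immediately upstream of a TSS.

    `tss` is a 1-based genome coordinate. The result is written 5'->3'
    on the transcribed strand and is clipped at the sequence ends.
    """
    if strand == "+":
        start = max(0, tss - 1 - width)
        return genome[start:tss - 1]
    # On the minus strand, upstream means higher genome coordinates.
    end = min(len(genome), tss + width)
    return reverse_complement(genome[tss:end])
```

The extracted windows would then be written to a FASTA file and passed to the motif finder; that step is omitted here.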
Figure 3
Location and expression of potential regulatory RNAs. Transcript abundances, based on cDNA reads, from regions proposed to function as (A) a fluoride-sensing riboswitch upstream of TK0513 [51] and (B) a pre-Q1 riboswitch designated sRk28. (C) Transcripts of sscA [24, 57] present in the 5'-UTR of TK0308. (D) Transcripts of tRNALys (Tkt03) present within the 5'-UTR of TK0306. An antisense RNA is also transcribed from the TK0306 region. The numbers of cDNA reads from transcripts present in T. kodakarensis cells growing exponentially with sulfur (Sexp; blue) and in stationary phase in sulfur medium (Sstat; red), growing exponentially in pyruvate medium before (Pexp; green) and 20 min after sulfur addition (PS; orange) are given by the peak heights. Data from the control library (C; exponential phase with sulfur) not digested with TEX are shown in grey. The relative abundance scales on the right of each panel allow direct comparisons of all data in that panel. The black scale bar in the top right corner of each panel corresponds to 100 nt.
Figure 4
Internal transcription start sites. The genome organizations surrounding (A) rpoL (TK1169) and (B) rpoN (TK1499). The promoter motifs for the pTSS of TK1169 and TKt26 and for the iTSS identified for rpoL and rpoN are shown below the panels. The abundances of transcripts present in T. kodakarensis cells growing exponentially (Sexp; blue) and in stationary phase (Sstat; red) in sulfur medium, growing exponentially in pyruvate medium before (Pexp; green) and 20 min after sulfur addition (PS; orange) are given by the peak heights. Data from the control library (C) not digested with TEX are shown in grey. The position of the sequence near the 3′-terminus of rpoL that, when transcribed, is predicted to form a stable RNA hairpin structure is indicated (see also Additional file 7: Figure S3). The black scale bar in the top right corner of each panel corresponds to 100 nt.
Figure 5
Transcription of TK1361 (MCM2) and TK1620 (MCM3). (A) As annotated in the T. kodakarensis genome [38], TK1361 has an atypical 5'-extension, here shown by broken lines. The location of the TSS identified for TK1361 and the sequence downstream that could function as a translation start site are shown. With the TSS re-categorized from iTSS to pTSS and translation initiated at the boxed GTG codon, the MCM2 generated has a standard MCM structure. (B) The organization of the TK1619-TK1620 (MCM3) region. The locations and upstream promoter sequences for the pTSS and an iTSS within TK1620 are indicated. A putative GTG translation initiation codon downstream of the iTSS is boxed. In both panels the protein domains, identified in the NCBI Conserved Domain Database [58], are shown in dark grey. The abundances of transcripts present in T. kodakarensis cells growing exponentially (Sexp; blue) and in stationary phase (Sstat; red) in sulfur medium, growing exponentially in pyruvate medium before (Pexp; green) and 20 min after sulfur addition (PS; orange) are given by the peak heights. Data from the control library (C) not digested with TEX are shown in grey. The relative abundance scales on the right of each panel allow direct comparisons of all data in that panel. The black scale bar in the top right corner of each panel corresponds to 100 nt.
Figure 6
Heatmap comparison of changes in gene expression after sulfur addition. Changes in transcript abundance are shown, on a log2-fold scale, for the TK genes listed. Comparisons were made of the cDNA libraries generated from RNA isolated before and 20 min after addition of sulfur to T. kodakarensis cells growing exponentially in pyruvate. Values were calculated based on changes in the abundance of cDNA reads, as described in Materials and Methods. The T. kodakarensis data are aligned and compared with the microarray hybridization results reported for sulfur-induced changes in transcription of the homologous mbx and mbh operons, and related homologous genes, in P. furiosus [33].
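A log2-fold change between two read-count libraries can be computed as sketched below. The exact calculation is given in the paper's Materials and Methods and is not reproduced here, so the library-size normalization and the pseudocount in this sketch are assumptions, and the function name is hypothetical.

```python
import math


def log2_fold_change(before, after, total_before, total_after, pseudocount=1.0):
    """log2 ratio of library-size-normalized read counts (after / before).

    `before` and `after` are read counts for one gene; `total_before` and
    `total_after` are the total mapped reads in each library. The
    pseudocount (an assumption) avoids division by zero for unexpressed genes.
    """
    norm_before = (before + pseudocount) / total_before
    norm_after = (after + pseudocount) / total_after
    return math.log2(norm_after / norm_before)
```

Positive values indicate higher abundance 20 min after sulfur addition and negative values indicate lower abundance; a value of 2 corresponds to a four-fold increase.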

References

    1. Cavicchioli R. Archaea–timeline of the third domain. Nat Rev Microbiol. 2011;9:51–61. doi: 10.1038/nrmicro2482.
    2. Sato T, Fukui T, Atomi H, Imanaka T. Targeted gene disruption by homologous recombination in the hyperthermophilic archaeon Thermococcus kodakarensis KOD1. J Bacteriol. 2003;185:210–220. doi: 10.1128/JB.185.1.210-220.2003.
    3. Farkas JA, Picking JW, Santangelo TJ. Genetic techniques for the archaea. Annu Rev Genet. 2013;47:539–561. doi: 10.1146/annurev-genet-111212-133225.
    4. Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10:57–63. doi: 10.1038/nrg2484.
    5. Croucher NJ, Thomson NR. Studying bacterial transcriptomes using RNA-seq. Curr Opin Microbiol. 2010;13:619–624. doi: 10.1016/j.mib.2010.09.009.