Simultaneous mapping of transcript ends at single-nucleotide resolution and identification of widespread promoter-associated non-coding RNA governed by TATA elements

Nucleic Acids Res. 2014 Apr;42(6):3736-49. doi: 10.1093/nar/gkt1366. Epub 2014 Jan 10.

Abstract

Understanding the relationships between regulatory factor binding, chromatin structure, cis-regulatory elements and RNA-regulation mechanisms relies on accurate information about transcription start sites (TSS) and polyadenylation sites (PAS). Although several approaches have identified transcript ends in yeast, limitations of resolution and coverage have remained, and definitive identification of TSS and PAS with single-nucleotide resolution has not yet been achieved. We developed SMORE-seq (simultaneous mapping of RNA ends by sequencing) and used it to simultaneously identify the strongest TSS for 5207 (90%) genes and PAS for 5277 (91%) genes. The new transcript annotations identified by SMORE-seq showed improved distance relationships with TATA-like regulatory elements, nucleosome positions and active RNA polymerase. We found 150 genes whose TSS were downstream of the annotated start codon, and additional analysis of evolutionary conservation and ribosome footprinting suggests that these protein-coding sequences are likely to be mis-annotated. SMORE-seq detected short non-coding RNAs transcribed divergently from more than a thousand promoters in wild-type cells under normal conditions. These divergent non-coding RNAs were less evident at promoters containing canonical TATA boxes, suggesting a model where transcription initiation at promoters by RNAPII is bidirectional, with TATA elements serving to constrain the directionality of initiation.
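As an illustration of what "strongest TSS" and "TSS downstream of the annotated start codon" can mean operationally, the following is a minimal sketch, not the authors' SMORE-seq pipeline. It assumes that read 5' ends have already been mapped to genomic coordinates on a gene's strand and that a search window around the gene has been chosen. The function names, the window and the tie-breaking rule are all hypothetical choices made for this example.

```python
from collections import Counter


def strongest_tss(read_starts, window_start, window_end):
    """Return the coordinate with the most read 5' ends inside the window.

    Assumption: read_starts is a list of integer coordinates of read 5' ends
    on the gene's strand. Returns None if no read falls inside the window.
    """
    counts = Counter(
        pos for pos in read_starts if window_start <= pos <= window_end
    )
    if not counts:
        return None
    # Highest count wins; ties are broken by the smallest coordinate
    # (an arbitrary choice made here only for determinism).
    position, _ = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return position


def tss_downstream_of_start_codon(tss, start_codon, strand):
    """Return True if the TSS lies downstream of the annotated start codon.

    On the '+' strand, downstream means a larger coordinate; on the '-'
    strand, it means a smaller coordinate.
    """
    if strand == "+":
        return tss > start_codon
    return tss < start_codon


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    starts = [100, 100, 101, 105, 105, 105, 300]
    tss = strongest_tss(starts, 90, 200)
    print(tss, tss_downstream_of_start_codon(tss, 100, "+"))
```

A gene flagged by the second function would be a candidate for a mis-annotated start codon only; the abstract notes that evolutionary conservation and ribosome footprinting were used as additional evidence.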

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Codon, Initiator
  • Molecular Sequence Annotation
  • Nucleotides / analysis
  • Polyadenylation
  • Promoter Regions, Genetic
  • RNA Caps / chemistry
  • RNA, Untranslated / biosynthesis*
  • Saccharomyces cerevisiae / genetics
  • Sequence Analysis, RNA
  • TATA Box*
  • Transcription Initiation Site
  • Transcription Initiation, Genetic*

Substances

  • Codon, Initiator
  • Nucleotides
  • RNA Caps
  • RNA, Untranslated

Associated data

  • GEO/GSE49026
  • GEO/GSE52355