Genome-wide identification of transcript start and end sites by transcript isoform sequencing

Nat Protoc. 2014 Jul;9(7):1740-59. doi: 10.1038/nprot.2014.121. Epub 2014 Jun 26.

Abstract

Hundreds of transcript isoforms with varying boundaries and alternative regulatory signals are transcribed from the genome, even in a genetically homogeneous population of cells. To study this transcriptional heterogeneity, we developed transcript isoform sequencing (TIF-seq), a method that allows genome-wide profiling of full-length transcript isoforms defined by their exact 5' and 3' boundaries. TIF-seq entails the generation of full-length cDNA libraries, followed by their circularization and the sequencing of the junction fragments spanning the 5' and 3' transcript ends. By determining the co-occurrence of start and end sites in individual transcript molecules, TIF-seq can distinguish variations that conventional approaches for mapping single ends cannot, such as short abortive transcripts, bicistronic messages and overlapping transcripts that differ in length. The TIF-seq protocol we describe here can be applied to any eukaryotic organism (e.g., yeast, human), and it requires 6-10 d for generating TIF-seq libraries, 10 d for sequencing and 2-3 d for analysis.
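The value of pairing start and end sites can be illustrated with a minimal sketch. It is not the analysis pipeline of the protocol: the tuple layout, the function names `count_isoforms` and `count_single_ends`, and all coordinates are assumptions made up for illustration. Each tuple stands for one transcript molecule whose 5' and 3' ends were recovered together from a junction fragment.

```python
from collections import Counter

# Hypothetical input: one (chromosome, strand, start, end) tuple per sequenced
# junction fragment, i.e. per transcript molecule. All coordinates are invented.
molecules = [
    ("chrI", "+", 100, 900),
    ("chrI", "+", 100, 900),
    ("chrI", "+", 100, 400),   # short transcript sharing the same start site
    ("chrI", "+", 250, 900),   # overlapping transcript sharing the same end site
    ("chrI", "+", 100, 1800),  # long transcript, e.g. one spanning two adjacent genes
]


def count_isoforms(molecules):
    """Count molecules per isoform, where an isoform is a (start, end) pair."""
    return Counter(molecules)


def count_single_ends(molecules):
    """Count start sites and end sites independently, as single-end mapping would."""
    starts = Counter((chrom, strand, start) for chrom, strand, start, _ in molecules)
    ends = Counter((chrom, strand, end) for chrom, strand, _, end in molecules)
    return starts, ends


isoforms = count_isoforms(molecules)
starts, ends = count_single_ends(molecules)

for isoform, n in isoforms.most_common():
    print(isoform, n)
```

In this toy data the paired counts separate four isoforms, whereas the independent counts show only two start sites and three end sites, with no indication of which start belongs to which end.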

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genetic Variation
  • Genome
  • Genomics
  • Protein Isoforms / chemistry*
  • Protein Isoforms / genetics
  • RNA, Messenger / chemistry*
  • RNA, Messenger / genetics
  • Saccharomyces cerevisiae / genetics*
  • Sequence Analysis, RNA / methods*

Substances

  • Protein Isoforms
  • RNA, Messenger