RNA sequencing and quantitation using the Helicos Genetic Analysis System

Tal Raz; Marie Causey; Daniel R Jones; Alix Kieu; Stan Letovsky; Doron Lipson; Edward Thayer; John F Thompson; Patrice M Milos

doi:10.1007/978-1-61779-089-8_3

Methods Mol Biol. 2011:733:37-49. doi: 10.1007/978-1-61779-089-8_3.

Authors

Tal Raz¹, Marie Causey, Daniel R Jones, Alix Kieu, Stan Letovsky, Doron Lipson, Edward Thayer, John F Thompson, Patrice M Milos

Affiliation

¹ Helicos BioSciences Corporation, Cambridge, MA, USA. traz@helicosbio.com

PMID: 21431761

Abstract

The recent transition in gene expression analysis technology to ultra high-throughput cDNA sequencing provides a means for higher quantitation sensitivity across a wider dynamic range than previously possible. Sensitivity of detection is mostly a function of the sheer number of sequence reads generated. Typically, RNA is converted to cDNA using random hexamers and the cDNA is subsequently sequenced (RNA-Seq). With this approach, more reads are generated for long transcripts than for short ones. This length bias necessitates the generation of very high read numbers to achieve sensitive quantitation of short, low-expressed genes. To eliminate this length bias, we have developed an ultra high-throughput sequencing approach in which only a single read is generated for each transcript molecule (single-molecule sequencing Digital Gene Expression, smsDGE). For example, equivalent quantitation accuracy of the yeast transcriptome can be achieved by smsDGE using only 25% of the reads that would be required using RNA-Seq. For sample preparation, RNA is first reverse-transcribed into single-stranded cDNA using oligo-dT as a primer. A poly-A tail is then added to the 3' ends of the cDNA to facilitate hybridization of the sample to the Helicos® single-molecule sequencing Flow-Cell, where a poly-dT oligo serves as the substrate for subsequent sequencing by synthesis. No PCR, sample size selection, or ligation steps are required, thus avoiding possible biases that may be introduced by such manipulations. Each tailed cDNA sample is injected into one of 50 flow-cell channels and sequenced on the Helicos® Genetic Analysis System. Thus, 50 samples are sequenced simultaneously, generating 10-20 million sequence reads on average for each sample channel. The sequence reads can then be aligned to the reference of choice, such as the transcriptome for quantitation of known transcripts or the genome for novel transcript discovery. This chapter provides a summary of the methods required for smsDGE.
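The length bias described above can be illustrated with a minimal Python sketch. It is not part of the chapter's protocol: the transcript names, copy numbers, lengths, and read budget are hypothetical toy values, and the model assumes idealized sampling in which RNA-Seq reads are drawn in proportion to copies × length while smsDGE yields reads in proportion to copies alone.

```python
def expected_reads(transcripts, total_reads, length_weighted):
    """Split a read budget across transcripts under an idealized model.

    transcripts: dict of name -> (copies, length_nt); toy values, not real data.
    length_weighted=True  approximates random-primed RNA-Seq
                          (reads proportional to copies * length).
    length_weighted=False approximates smsDGE
                          (one read per molecule, reads proportional to copies).
    """
    weights = {
        name: copies * (length if length_weighted else 1)
        for name, (copies, length) in transcripts.items()
    }
    total_weight = sum(weights.values())
    return {name: total_reads * w / total_weight for name, w in weights.items()}


# Hypothetical example: two genes expressed at the same level (10 copies each)
# that differ tenfold in length.
toy_transcripts = {
    "short_gene": (10, 500),
    "long_gene": (10, 5000),
}
read_budget = 1_100_000

rna_seq = expected_reads(toy_transcripts, read_budget, length_weighted=True)
sms_dge = expected_reads(toy_transcripts, read_budget, length_weighted=False)

for name in toy_transcripts:
    print(f"{name}: RNA-Seq {rna_seq[name]:,.0f} reads, smsDGE {sms_dge[name]:,.0f} reads")
```

In this toy case the two genes are equally abundant, yet the length-weighted model assigns the short gene one tenth of the reads given to the long gene, whereas the one-read-per-molecule model splits the budget evenly. Real data would also reflect priming efficiency, alignment rates, and sampling noise, none of which is modeled here.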

MeSH terms

DNA Primers / genetics
DNA Primers / metabolism
DNA, Complementary / biosynthesis
DNA, Complementary / metabolism
DNA, Single-Stranded / biosynthesis
DNA, Single-Stranded / metabolism
Deoxyribonucleases / metabolism
High-Throughput Nucleotide Sequencing / methods*
Nucleic Acid Hybridization
Poly A / metabolism
Polyadenylation
RNA / genetics*
RNA / metabolism
RNA, Fungal / genetics
RNA, Fungal / metabolism
RNA, Messenger / genetics
RNA, Messenger / metabolism
Sequence Analysis, DNA / methods*
Sequence Analysis, RNA / methods

Substances

DNA Primers
DNA, Complementary
DNA, Single-Stranded
RNA, Fungal
RNA, Messenger
Poly A
RNA
Deoxyribonucleases