Molecular Profiling of RNA Tumors Using High-Throughput RNA Sequencing: From Raw Data to Systems Level Analyses

Methods Mol Biol. 2019;1908:185-204. doi: 10.1007/978-1-4939-9004-7_13.

Abstract

RNAseq is a powerful technique for generating global profiles of transcriptomes in healthy and diseased states. In this chapter, we review pipelines for analyzing RNA sequencing data, from raw data to systems-level analysis. We first give an overview of the workflow for generating mapped reads from FASTQ files, including quality control of FASTQ files, filtering and trimming of reads, and alignment of reads to a genome. Then, we compare and contrast three popular options for determining differentially expressed (DE) transcripts (the Tuxedo pipeline, DESeq2, and Limma/voom). Finally, we examine four tool sets for deriving biological meaning from the list of DE genes (GeneCards, The Human Protein Atlas, GSEA, and ToppGene). We emphasize the need to ask a concise scientific question and to clearly understand the strengths and limitations of the methods.
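
The filtering and trimming step mentioned above can be illustrated with a minimal Python sketch. It is not the chapter's own pipeline: it assumes well-formed four-line FASTQ records with Phred+33 quality encoding, and the function names (parse_fastq, trim_3prime, filter_and_trim) and the thresholds (min_q, min_len) are hypothetical choices made for illustration. In practice, dedicated tools are used for this step.

```python
from typing import Iterable, Iterator, List, Tuple

# A record is (header, sequence, quality string); names here are illustrative.
Record = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from well-formed four-line FASTQ input (assumption)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # the '+' separator line is discarded
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def phred_scores(qual: str, offset: int = 33) -> List[int]:
    """Convert a quality string to Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in qual]


def trim_3prime(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Remove trailing bases whose Phred score is below min_q."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_and_trim(
    records: Iterable[Record], min_q: int = 20, min_len: int = 4
) -> Iterator[Record]:
    """Trim each read at the 3' end, then drop reads shorter than min_len."""
    for header, seq, qual in records:
        seq, qual = trim_3prime(seq, qual, min_q)
        if len(seq) >= min_len:
            yield header, seq, qual


# Toy input: 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
example = [
    "@read1", "ACGTACGT", "+", "IIIIII##",
    "@read2", "ACGT", "+", "####",
]
kept = list(filter_and_trim(parse_fastq(example)))
```

In this toy input, the first read loses its two low-quality trailing bases and the second read is discarded because nothing remains after trimming.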

Keywords: DESeq2; FASTQ; Gene Set Enrichment Analysis/GSEA; HTSeq; High-throughput sequencing (HTS); Limma/voom; RNA sequencing (RNAseq); ToppGene; Tuxedo pipeline.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Molecular Sequence Annotation / methods
  • Quality Control
  • Sequence Analysis, RNA / methods
  • Software*
  • Workflow