RNA-seq Data: Challenges in and Recommendations for Experimental Design and Analysis

Curr Protoc Hum Genet. 2014 Oct 1;83:11.13.1-20. doi: 10.1002/0471142905.hg1113s83.

Abstract

RNA-seq is widely used to determine differential expression of genes or transcripts, as well as to identify novel transcripts, detect allele-specific expression, and precisely measure translation of transcripts. Thoughtful experimental design and choice of analysis tools are critical to ensure high-quality data and interpretable results. Important considerations for experimental design include the number of replicates, whether to collect paired-end or single-end reads, sequence length, and sequencing depth. Analysis steps common to all RNA-seq experiments include quality control, read alignment, assigning reads to genes or transcripts, and estimating gene or transcript abundance. Our aims are two-fold: to make recommendations for common components of experimental design and to assess tool capabilities for each of these analysis steps. We also test tools designed to detect differential expression, since this is the most widespread application of RNA-seq. We hope that these analyses will help guide those who are new to RNA-seq and will generate discussion about remaining needs for tool improvement and development.

Keywords: RNA-seq experimental design; biological replicates; differential expression; paired-end sequencing; sequence length; sequencing depth; splice-aware alignment; transcript abundance.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Polymerase Chain Reaction
  • Quality Control
  • RNA Splicing
  • RNA, Messenger / genetics
  • Sequence Analysis, RNA*

Substances

  • RNA, Messenger