A comparative study of techniques for differential expression analysis on RNA-Seq data

PLoS One. 2014 Aug 13;9(8):e103207. doi: 10.1371/journal.pone.0103207. eCollection 2014.

Abstract

Recent advances in next-generation sequencing technology allow high-throughput cDNA sequencing (RNA-Seq) to be widely applied in transcriptomic studies, in particular for detecting differentially expressed genes (DEGs) between treatment groups. Many software packages have been developed to identify DEGs from RNA-Seq data. However, there is no consensus on how to design an optimal study or which software to choose for the analysis. In this comparative study we evaluate the performance of three of the most frequently used software tools: Cufflinks-Cuffdiff2, DESeq and edgeR. We considered a number of important parameters of RNA-Seq technology, including the number of replicates, sequencing depth, and balanced vs. unbalanced sequencing depth within and between groups. We benchmarked results against sets of DEGs identified through either quantitative RT-PCR or microarray. We observed that edgeR performs slightly better than DESeq and Cuffdiff2 in its ability to uncover true positives. Overall, we recommend DESeq, or the intersection of DEGs from two or more tools, when the number of false positives is a major concern in the study. In other circumstances, edgeR is slightly preferable for differential expression analysis, at the expense of potentially introducing more false positives.
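
The intersection strategy mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes each tool's DEG calls have already been exported as a collection of gene identifiers. The function names (`intersect_degs`, `score_against_benchmark`) and the gene lists are hypothetical and invented for illustration only; they are not the study's data or code, and the counts say nothing about the real tools.

```python
def intersect_degs(*deg_sets):
    """Return the genes called differentially expressed by every tool."""
    if not deg_sets:
        return set()
    return set.intersection(*(set(s) for s in deg_sets))


def score_against_benchmark(called, benchmark_degs, benchmark_non_degs):
    """Count true and false positives relative to a validated gene set
    (for example, genes confirmed or rejected by qRT-PCR)."""
    called = set(called)
    return {
        "true_positives": len(called & set(benchmark_degs)),
        "false_positives": len(called & set(benchmark_non_degs)),
    }


# Toy gene lists (hypothetical, not taken from the paper).
edger_calls = {"geneA", "geneB", "geneC", "geneD"}
deseq_calls = {"geneA", "geneB", "geneE"}
cuffdiff_calls = {"geneA", "geneB", "geneD"}

validated_degs = {"geneA", "geneB", "geneC"}
validated_non_degs = {"geneD", "geneE"}

consensus = intersect_degs(edger_calls, deseq_calls, cuffdiff_calls)
edger_score = score_against_benchmark(edger_calls, validated_degs, validated_non_degs)
consensus_score = score_against_benchmark(consensus, validated_degs, validated_non_degs)

print("consensus:", sorted(consensus))
print("single tool:", edger_score)
print("intersection:", consensus_score)
```

In this toy case the intersection drops the one false positive but also loses one validated gene, which is the kind of trade-off between false positives and true positives that the abstract describes.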

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Benchmarking
  • Cell Line
  • DNA, Complementary / chemistry
  • Gene Expression Profiling / methods*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Male
  • Mice, Inbred C57BL
  • RNA / chemistry
  • Reverse Transcriptase Polymerase Chain Reaction
  • Sequence Analysis, RNA / methods
  • Software*

Substances

  • DNA, Complementary
  • RNA

Grants and funding

This work was supported by a National Health and Medical Research Council (NHMRC) Program Grant to PFB, Australian Research Council (FT FT0991360) and NHMRC (613602) fellowships to NRW, and the Estate of Dr. Clem Jones AO. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.