RNA-Seq vs dual- and single-channel microarray data: sensitivity analysis for differential expression and clustering

PLoS One. 2012;7(12):e50986. doi: 10.1371/journal.pone.0050986. Epub 2012 Dec 10.


With the rapid development of high-throughput sequencing technologies, a new generation of genome-wide gene expression measurements is under way. It is based on mRNA sequencing (RNA-seq), which complements the already mature microarray technology and is expected to overcome some of its disadvantages. RNA-seq data pose new challenges, however, because their strengths and weaknesses have yet to be fully identified. Ideally, Next (or Second) Generation Sequencing measurements could be integrated with microarray data for a more comprehensive investigation of gene expression and to facilitate the analysis of whole regulatory networks. At present, however, the nature of these data is not well understood. In this paper we study three alternative gene expression time series datasets for Drosophila melanogaster embryo development in order to compare three measurement techniques: RNA-seq, single-channel microarrays and dual-channel microarrays. The aim is to study the state of the art for the three technologies, with a view to assessing overlapping features, data compatibility and integration potential in the context of time series measurements. This involves using established tools for each of the three technologies. Because biological RNA-seq replicates are of limited availability for time series data, the RNA-seq dataset uses technical replicates, while the microarray datasets use biological replicates. The approach consists of a sensitivity analysis for differential expression and clustering. In general, the RNA-seq dataset displayed the highest sensitivity to differential expression. The single-channel data performed similarly for the differentially expressed genes common to the gene sets considered. Cluster analysis was used to identify different features of the gene space for the three datasets, with higher similarities found between the RNA-seq and single-channel microarray datasets.
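As a rough illustration of what comparing differentially expressed gene sets across platforms can look like, the sketch below computes pairwise Jaccard overlap between three gene sets and the genes common to all of them. This is not the authors' pipeline: the Jaccard index is swapped in as a simple stand-in measure, the function names `jaccard` and `pairwise_overlap` are hypothetical, and the gene identifiers are invented toy data, not results from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two collections of gene IDs (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(de_sets):
    """Map each pair of platform names (sorted alphabetically) to its Jaccard index."""
    return {
        (p, q): jaccard(de_sets[p], de_sets[q])
        for p, q in combinations(sorted(de_sets), 2)
    }


# Toy data: invented gene IDs standing in for per-platform lists of
# differentially expressed genes. These are assumptions, not study results.
de_sets = {
    "rna_seq": {"g1", "g2", "g3", "g4"},
    "single_channel": {"g2", "g3", "g4", "g5"},
    "dual_channel": {"g1", "g5", "g6"},
}

overlaps = pairwise_overlap(de_sets)

# Genes called differentially expressed on every platform.
common = set.intersection(*de_sets.values())

for pair, score in overlaps.items():
    print(pair, round(score, 3))
print("common to all platforms:", sorted(common))
```

A set-based overlap of this kind only says how much the gene lists agree; it does not account for effect sizes or for the different replicate types behind each list.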

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cluster Analysis
  • Databases, Genetic
  • Drosophila melanogaster
  • Gene Expression Profiling / methods
  • Gene Expression*
  • Oligonucleotide Array Sequence Analysis / methods*
  • RNA / genetics*
  • RNA / metabolism
  • Sequence Analysis, RNA / methods*

Substances

  • RNA

Grants and funding

Most of this work has been conducted with support from the Irish Research Council for Science Engineering and Technology EMBARK Initiative. The authors also thank the EveryAware project funded by the Future and Emerging Technologies program (IST-FET) of the European Commission under the EU RD contract IST-265432. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.