Optimizing a massive parallel sequencing workflow for quantitative miRNA expression analysis

PLoS One. 2012;7(2):e31630. doi: 10.1371/journal.pone.0031630. Epub 2012 Feb 20.

Abstract

Background: Massive Parallel Sequencing (MPS) methods can extend and improve the knowledge obtained by conventional microarray technology, both for mRNAs and for short non-coding RNAs, e.g. miRNAs. The processing methods used to extract and interpret the information are an important aspect of dealing with the vast amounts of data generated by short-read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as parts of a complex analytical pipeline have not yet been well investigated.

Primary findings: A benchmark MPS miRNA data set, resembling a situation in which miRNAs are spiked in biological replication experiments, was assembled by merging a publicly available MPS spike-in miRNA data set with MPS data derived from healthy donor peripheral blood mononuclear cells. Using this data set, we observed that short-read counts are strongly underestimated for duplicated miRNAs when the whole genome is used as the reference. Furthermore, the sensitivity of miRNA detection depends strongly on the primary tool used in the analysis. Among the six aligners tested, specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient: of the five tools investigated, two (DESeq and baySeq) show very good specificity and sensitivity in the detection of differential expression.
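To make the evaluation terms concrete, the following is a minimal sketch of how sensitivity and specificity of differential expression calls can be scored against a spike-in benchmark, where the truly changed miRNAs are known by design. It is not the authors' code: the function name `sensitivity_specificity`, the miRNA identifiers, and the call sets are hypothetical and chosen only for illustration.

```python
def sensitivity_specificity(called, truth, universe):
    """Score a set of differential-expression calls against known spike-ins.

    called   -- miRNAs a tool reported as differentially expressed
    truth    -- miRNAs known to be differentially expressed (the spike-ins)
    universe -- all miRNAs that were tested
    All three arguments are hypothetical inputs for this illustration.
    """
    called, truth, universe = set(called), set(truth), set(universe)
    tp = len(called & truth)              # spike-ins correctly called
    fn = len(truth - called)              # spike-ins missed
    fp = len(called - truth)              # background miRNAs wrongly called
    tn = len(universe - called - truth)   # background miRNAs correctly ignored
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up example: 10 tested miRNAs, 4 of them spiked in.
universe = [f"miR-{i}" for i in range(1, 11)]
truth = {"miR-1", "miR-2", "miR-3", "miR-4"}
called = {"miR-1", "miR-2", "miR-3", "miR-7"}

sens, spec = sensitivity_specificity(called, truth, universe)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy case three of the four spike-ins are recovered and one of the six background miRNAs is called in error, so a tool with both values close to 1 would correspond to the "very good specificity and sensitivity" described above.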

Conclusions: The results of our analysis allow the definition of a clear, simple, and optimized analytical workflow for the digital quantitative analysis of miRNAs.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Databases, Genetic
  • Gene Expression Profiling*
  • Gene Expression Regulation
  • Genome, Human / genetics
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • MicroRNAs / genetics*
  • MicroRNAs / metabolism
  • ROC Curve
  • Reference Standards
  • Sample Size
  • Sequence Alignment
  • Software
  • Workflow*

Substances

  • MicroRNAs