Novel rank-based statistical methods reveal microRNAs with differential expression in multiple cancer types

PLoS One. 2009 Nov 25;4(11):e8003. doi: 10.1371/journal.pone.0008003.


Background: MicroRNAs (miRNAs) regulate target genes at the post-transcriptional level and play important roles in cancer pathogenesis and development. Variation amongst individuals is a significant confounding factor in miRNA (or other) expression studies. The true character of biologically or clinically meaningful differential expression can be obscured by inter-patient variation. In this study we aim to identify miRNAs with consistent differential expression in multiple tumor types using a novel data analysis approach.

Methods: Using microarrays we profiled the expression of more than 700 miRNAs in 28 matched tumor/normal samples from 8 different tumor types (breast, colon, liver, lung, lymphoma, ovary, prostate and testis). This set is unique in its emphasis on minimizing tissue-type and patient-related variability by using normal and tumor samples from the same patient. We develop scores for comparing miRNA expression in these matched samples, based on a rigorous characterization of the distribution of order statistics over a discrete state set, including exact p-values. Specifically, we compute a Rank Consistency Score (RCoS) for every miRNA measured in our data. Our methods are also applicable in various other contexts. We compare our methods, as applied to matched samples, to the paired t-test and to the Wilcoxon signed-rank test.
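For readers unfamiliar with the baseline tests named above, the following is a minimal sketch of an exact two-sided Wilcoxon signed-rank test on matched tumor/normal values for a single miRNA. It is not the RCoS, whose definition is not given in this abstract. The function name and the example values are hypothetical, and the sketch assumes that zero differences are dropped and that the remaining absolute differences contain no ties.

```python
from typing import Sequence, Tuple


def wilcoxon_signed_rank_exact(
    tumor: Sequence[float], normal: Sequence[float]
) -> Tuple[int, float]:
    """Return (W_plus, exact two-sided p-value) for matched pairs.

    Illustrative assumptions: zero differences are dropped and the
    remaining absolute differences are all distinct (no ties).
    """
    diffs = [t - n for t, n in zip(tumor, normal) if t != n]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    abs_sorted = sorted(abs(d) for d in diffs)
    if len(set(abs_sorted)) != n:
        raise ValueError("tied absolute differences are not handled here")

    # Rank absolute differences 1..n, then sum the ranks of positive ones.
    rank = {a: i + 1 for i, a in enumerate(abs_sorted)}
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)

    # counts[s] = number of subsets of {1..n} whose elements sum to s,
    # i.e. the exact null distribution of W_plus (times 2**n).
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]

    total = 2 ** n
    lower = sum(counts[: w_plus + 1])  # P(W <= w_plus) * total
    upper = sum(counts[w_plus:])       # P(W >= w_plus) * total
    p_two_sided = min(1.0, 2 * min(lower, upper) / total)
    return w_plus, p_two_sided


# Hypothetical expression values for one miRNA in six matched pairs,
# lower in every tumor sample than in its matched normal sample.
normal_vals = [10, 12, 9, 11, 13, 10]
tumor_vals = [9, 10, 6, 7, 8, 4]
print(wilcoxon_signed_rank_exact(tumor_vals, normal_vals))  # (0, 0.03125)
```

With only six pairs, even a perfectly consistent direction of change cannot yield a two-sided p-value below 2/64, which shows why exact p-values matter at small sample sizes.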

Results: We identify miRNAs that are consistently differentially expressed across the cancer types measured. 41 miRNAs are under-expressed in cancer compared to normal at an FDR (False Discovery Rate) of 0.05, and 17 are over-expressed at the same FDR level. Differentially expressed miRNAs include known oncomiRs (e.g., miR-96) as well as miRNAs that were not previously universally associated with cancer. Specific examples include miR-133b and miR-486-5p, which are consistently downregulated, and miR-629*, which is consistently upregulated in cancer, in the context of our cohort. Data are available in GEO. Software is available at:
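As an aside on the FDR threshold quoted in the Results (this is an illustrative sketch, not the authors' software): the abstract does not say which FDR procedure was used, so the example below assumes the common Benjamini-Hochberg step-up rule. The function name and the p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], fdr: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis under the BH step-up rule."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical per-miRNA p-values, for illustration only.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_vals, fdr=0.05))
```

Because the rule is step-up, a p-value that misses its own rank threshold is still rejected whenever a larger p-value meets the threshold at a higher rank.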

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cell Line, Tumor
  • Computational Biology / methods
  • Gene Expression Profiling*
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • MicroRNAs / metabolism*
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis
  • RNA Processing, Post-Transcriptional
  • Software
  • Tissue Distribution


Substances

  • MicroRNAs