miRNA normalization enables joint analysis of several datasets to increase sensitivity and to reveal novel miRNAs differentially expressed in breast cancer

PLoS Comput Biol. 2021 Feb 10;17(2):e1008608. doi: 10.1371/journal.pcbi.1008608. eCollection 2021 Feb.


Different miRNA profiling protocols and technologies introduce differences in the resulting quantitative expression profiles, including differences in the presence (and measurability) of certain miRNAs. We present and examine Adjusted Quantile Normalization (AQuN), a method based on quantile normalization, to combine miRNA expression data from multiple breast cancer studies into a single joint dataset for integrative analysis. Pooling multiple datasets increases statistical power and surfaces patterns that do not reach statistical significance when the datasets are analyzed separately. Merging several datasets, as we do here, requires overcoming both technical and batch differences between them. We compare several approaches for merging and jointly analyzing miRNA datasets. We investigate the statistical confidence for known results and highlight potential new findings that resulted from the joint analysis using AQuN. In particular, we detect several miRNAs that are differentially expressed in estrogen receptor (ER) positive versus ER negative samples. In addition, we identify new potential biomarkers and therapeutic targets for both clinical groups. As a specific example, using the AQuN-derived dataset we find that hsa-miR-193b-5p shows statistically significant overexpression in the ER positive group, a phenomenon that was not previously reported. Furthermore, functional assays show that overexpression of hsa-miR-193b-5p in breast cancer cell lines decreased cell viability and induced apoptosis. Together, these observations suggest a novel functional role for this miRNA in breast cancer. Packages implementing AQuN are provided for Python and Matlab: https://github.com/YakhiniGroup/PyAQN.
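
For readers unfamiliar with the underlying technique, the following is a minimal sketch of plain quantile normalization applied to a pooled expression matrix. It is not the AQuN algorithm and does not use the PyAQN API: the function name `quantile_normalize`, the toy arrays `study_a` and `study_b`, and the rows-as-miRNAs layout are all assumptions made for illustration. The sketch assumes every miRNA is measured in every sample, so it does not address the differences in miRNA presence that the abstract highlights.

```python
import numpy as np


def quantile_normalize(matrix):
    """Plain quantile normalization (illustrative, not AQuN).

    Rows are miRNAs, columns are samples. Ties are broken by position
    rather than averaged, which is a simplification.
    """
    matrix = np.asarray(matrix, dtype=float)
    # Rank of each value within its own column (0 = smallest).
    ranks = np.argsort(np.argsort(matrix, axis=0), axis=0)
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = np.sort(matrix, axis=0).mean(axis=1)
    # Replace every value by the reference value at its rank.
    return reference[ranks]


# Hypothetical toy data: two studies measured on very different scales.
study_a = np.array([[5.0, 4.0], [2.0, 1.0], [3.0, 6.0]])
study_b = np.array([[50.0], [20.0], [40.0]])

# Pool the samples column-wise, then normalize them jointly.
pooled = np.hstack([study_a, study_b])
normalized = quantile_normalize(pooled)
print(normalized)
```

After this step every sample shares the same empirical distribution while each miRNA keeps its within-sample rank, which is what makes values from different studies comparable on a common scale.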

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers / metabolism
  • Biomarkers, Tumor / genetics
  • Breast Neoplasms / genetics*
  • Breast Neoplasms / metabolism*
  • Cell Line, Tumor
  • Computer Simulation
  • Estrogen Receptor alpha / metabolism
  • Female
  • Gene Expression Profiling*
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • MCF-7 Cells
  • MicroRNAs / metabolism*
  • Oligonucleotide Array Sequence Analysis
  • Programming Languages
  • RNA, Messenger / genetics

Substances

  • Biomarkers
  • Biomarkers, Tumor
  • ESR1 protein, human
  • Estrogen Receptor alpha
  • MIRN193 microRNA, human
  • MicroRNAs
  • RNA, Messenger

Grant support

Helse Vest (https://helse-vest.no/en) grant 911450 was received by KJ. This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme (https://ec.europa.eu/programmes/horizon2020/en) under Grant agreement No. 847912. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.