Two decades after the discovery of the first animal microRNA (miRNA), the number of miRNAs in animal genomes remains a vexing question. Here, we report findings from analyzing 1,323 short RNA sequencing (RNA-seq) samples from 13 different human tissue types. Using stringent thresholding criteria, we identified 3,707 statistically significant novel mature miRNAs, arising from 3,494 novel precursors, at a false discovery rate of ≤ 0.05; 91.5% of these novel miRNAs were identified independently in 10 or more of the processed samples. Analysis of these novel miRNAs revealed tissue-specific dependencies and a correspondingly low Jaccard similarity index in intertissue comparisons. Of these novel miRNAs, 1,657 (45%) were identified in 43 datasets that were generated by cross-linking followed by Argonaute immunoprecipitation and sequencing (Ago CLIP-seq) and that represented 3 of the 13 tissues, indicating that these miRNAs are active in the RNA interference pathway. Moreover, experimental investigation by stem-loop PCR of a random collection of newly discovered miRNAs in 12 cell lines representing 5 tissues confirmed their presence and tissue dependence. Among the newly identified miRNAs are many novel miRNA clusters, new members of known miRNA clusters, previously unreported products from uncharacterized arms of miRNA precursors, and previously unrecognized paralogues of functionally important miRNA families (e.g., miR-15/107). Examination of sequence conservation across vertebrate and invertebrate organisms showed that 56.7% of the newly discovered miRNAs are human-specific, whereas the large majority (94.4%) are specific to the primate lineage. Our findings suggest that the repertoire of human miRNAs is far more extensive than currently represented in public repositories and that a substantial number of lineage- and/or tissue-specific miRNAs remain uncharacterized.
Keywords: RNA sequencing; isomIRs; microRNAs; noncoding RNA; transcriptome.
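
For readers unfamiliar with the Jaccard similarity index used in the intertissue comparisons, the following is a minimal sketch of how such a pairwise comparison could be computed. It is not the pipeline used in this study: the function names `jaccard` and `pairwise_jaccard`, the tissue labels, and the miRNA identifiers are all hypothetical placeholders, and the only assumption is that each tissue is summarized as the set of miRNAs detected in it. The index is the size of the intersection of two sets divided by the size of their union, so tissues that share few miRNAs score close to 0.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity index |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets score 0.0.
        return 0.0
    return len(a & b) / len(union)


def pairwise_jaccard(tissue_to_mirnas):
    """Return {(tissue1, tissue2): index} for every unordered tissue pair."""
    tissues = sorted(tissue_to_mirnas)
    return {
        (t1, t2): jaccard(tissue_to_mirnas[t1], tissue_to_mirnas[t2])
        for t1, t2 in combinations(tissues, 2)
    }


# Hypothetical placeholder data, not taken from the study.
example = {
    "tissue_A": {"mir_1", "mir_2", "mir_3"},
    "tissue_B": {"mir_3", "mir_4"},
    "tissue_C": {"mir_5"},
}

scores = pairwise_jaccard(example)
for pair, score in scores.items():
    print(pair, round(score, 3))
```

In this toy input, tissue_A and tissue_B share one of four distinct miRNAs, while tissue_C shares none with either, which illustrates how largely tissue-specific miRNA sets produce low pairwise indices.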