PLoS One. 2018 Aug 24;13(8):e0202369.
doi: 10.1371/journal.pone.0202369. eCollection 2018.

Genome-wide Identification of Clusters of Predicted microRNA Binding Sites as microRNA Sponge Candidates

Free PMC article


Xiaoyong Pan et al. PLoS One. 2018.


The number of natural miRNA sponges discovered in plants, viruses, and mammals is increasing steadily. Some sponges, such as ciRS-7 for miR-7, contain multiple miRNA binding sites in close proximity. We hypothesize that such clusters of miRNA binding sites in the genome can function together as a sponge. No systematic effort has yet been made to search for clusters of miRNA targets. Here we make what are, to our knowledge, the first genome-wide predictions of target site clusters for mature human miRNAs. For each miRNA, we predict the target sites on a genome-wide scale, build a graph with edge weights based on the pairwise distances between sites, and apply Markov clustering to identify genomic regions with high binding site density. Significant clusters are then extracted based on the difference in cluster size between the real genome and shuffled genomes that preserve local properties such as GC content. We then use conservation and binding energy to filter a final set of miRNA target site clusters, or sponge candidates. Our pipeline predicts 3673 sponge candidates for 1250 miRNAs, including the experimentally verified miR-7 sponge ciRS-7. In addition, we point explicitly to 19 high-confidence candidates overlapping annotated genomic sequence. The full list of candidates is freely available at, where detailed properties for individual candidates can be explored, such as alignment details, conservation, accessibility and target profiles, which facilitates the selection of sponge candidates for further context-specific analysis.
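The graph-building and Markov clustering steps described in the abstract can be illustrated with a minimal sketch. The linear edge weight `1 - d/max_dist`, the `max_dist` cutoff, and the function names `build_graph` and `markov_cluster` are assumptions made for illustration; the abstract does not specify the weighting scheme, and this toy loop is a stand-in for the actual MCL program rather than the authors' implementation.

```python
import numpy as np

def build_graph(positions, max_dist=1000.0):
    """Weighted adjacency matrix in which closer sites get heavier edges.

    Assumption: weight = 1 - d/max_dist, clipped at 0 for d >= max_dist.
    The diagonal is 1.0 and acts as the self-loop MCL usually adds.
    """
    pos = np.asarray(positions, dtype=float)
    dist = np.abs(pos[:, None] - pos[None, :])
    return np.clip(1.0 - dist / max_dist, 0.0, None)

def markov_cluster(adj, inflation=2.0, max_iter=100, tol=1e-6):
    """Toy Markov clustering: alternate expansion and inflation until stable."""
    m = adj / adj.sum(axis=0, keepdims=True)      # make columns stochastic
    for _ in range(max_iter):
        prev = m
        m = m @ m                                 # expansion: two-step random walk
        m = m ** inflation                        # inflation: favor strong flows
        m = m / m.sum(axis=0, keepdims=True)      # renormalize columns
        if np.abs(m - prev).max() < tol:
            break
    # Each surviving (attractor) row lists the members of one cluster.
    clusters = set()
    for row in m:
        members = tuple(int(i) for i in np.flatnonzero(row > 1e-6))
        if members:
            clusters.add(members)
    return sorted(clusters)

# Hypothetical binding-site start positions on one chromosome strand.
sites = [100, 120, 150, 50000, 50030, 50060]
clusters = markov_cluster(build_graph(sites), inflation=2.0)
print(clusters)
```

In this toy input the two groups of sites lie far more than `max_dist` apart, so no edge connects them and they end up in separate clusters. A higher inflation value generally yields finer-grained clusters.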

Conflict of interest statement

The authors have declared that no competing interests exist.


Fig 1. Flowchart of the analysis pipeline.
For each mature miRNA in miRBase v20, we ran RIsearch2 against both the real repeat-masked genome and a shuffled version to predict binding sites. We then used the Markov Cluster (MCL) algorithm to identify genomic clusters of binding sites and identified statistically significant clusters by comparing the results for the real and shuffled genomes. Finally, the significant clusters were further filtered by conservation and binding energy.
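One simple way to build a shuffled control that preserves local GC content is to permute nucleotides within fixed-size windows. The sketch below assumes that approach; the window size, the seed handling, and the name `windowed_shuffle` are hypothetical, and the shuffling procedure actually used in the pipeline is not described in this caption.

```python
import random

def windowed_shuffle(seq, window=100, seed=0):
    """Shuffle nucleotides within consecutive windows.

    Each window keeps its exact base composition (and therefore its GC
    content), while the order of bases inside the window is randomized.
    """
    rng = random.Random(seed)
    shuffled = []
    for start in range(0, len(seq), window):
        chunk = list(seq[start:start + window])
        rng.shuffle(chunk)
        shuffled.append("".join(chunk))
    return "".join(shuffled)

def gc_count(seq):
    """Number of G and C bases in a sequence."""
    return sum(1 for base in seq if base in "GC")

# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
genome = "GCGCGGCCGC" * 10 + "ATATTAATTA" * 10
control = windowed_shuffle(genome, window=100)
print(gc_count(genome[:100]), gc_count(control[:100]))
```

Because each window is only permuted, a GC-rich region of the real sequence stays GC-rich in the control, which keeps the background rate of predicted binding sites comparable between the two.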
Fig 2. Cluster size distribution for predicted miR-7 binding sites.
The plots show the size distributions of the clusters obtained for the real and shuffled genomes when running MCL clustering with an inflation factor of 2.0 on the miR-7 binding sites predicted by (A) RIsearch2, (B) BLAST, and (C) GUUGle.
Fig 3. Overall cluster size distribution of miRNA binding sites predicted by RIsearch2.
The plot shows the size distributions obtained for real and shuffled genomes when pooling the results for 2578 mature human miRNAs. For each miRNA, we used RIsearch2 to predict binding sites and clustered them using MCL with inflation factor 3.5.
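A comparison of real and shuffled size distributions can be turned into a significance cutoff in several ways. The sketch below assumes one simple option: choose the smallest cluster size that at most a fraction `alpha` of shuffled-genome clusters reach, and keep real clusters at or above it. The function name, the `alpha` value, and the toy sizes are hypothetical and are not taken from the paper.

```python
def size_threshold(shuffled_sizes, alpha=0.05):
    """Smallest size s such that the fraction of shuffled clusters
    with size >= s is at most alpha (an empirical tail cutoff)."""
    n = len(shuffled_sizes)
    for s in range(1, max(shuffled_sizes) + 2):
        tail = sum(1 for x in shuffled_sizes if x >= s) / n
        if tail <= alpha:
            return s

# Toy cluster sizes (number of binding sites per cluster), for illustration only.
shuffled = [1] * 90 + [2] * 8 + [3] * 2
real = [1, 2, 3, 5, 8]

cutoff = size_threshold(shuffled, alpha=0.05)
significant = [size for size in real if size >= cutoff]
print(cutoff, significant)
```

The loop always terminates with a value, because no shuffled cluster exceeds the maximum observed size and the tail fraction is therefore zero one step above it.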
Fig 4. Genomic context of the sponge candidates.
The upper bar chart compares the percentage of each transcript type in the whole genome, based on GENCODE and circBase, with its percentage among our sponge predictions, calculated after assigning annotations to the predicted sponge candidates. For each transcript type, we calculate the percentage of nucleotides it covers in the whole genome and in the annotated sponges; enrichment is then evaluated by comparing the two percentages. Because protein-coding genes (PCGs) and circRNAs overlap substantially, we further divide them into “PCG not circRNA”, “circRNA not PCG” and “circRNA and PCG”, which refer to PCGs not overlapping circRNAs, circRNAs not overlapping PCGs, and PCGs overlapping circRNAs, respectively. The lower bar chart shows the percentage of nucleotides located in introns, exons, 3’ UTRs, and 5’ UTRs for all annotated PCG sponge candidates. All percentages are calculated based on the number of nucleotides, excluding masked repeats, and are strand-sensitive.
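The enrichment comparison described in this caption amounts to a ratio of two nucleotide fractions. The sketch below shows that arithmetic with made-up nucleotide counts; the function name `enrichment`, the category labels used as dictionary keys, and all numbers are hypothetical and do not reproduce the values in the figure.

```python
def enrichment(sponge_nt, sponge_total, genome_nt, genome_total):
    """Ratio of the nucleotide fraction of a transcript type within
    sponge candidates to its fraction in the whole (unmasked) genome."""
    sponge_fraction = sponge_nt / sponge_total
    genome_fraction = genome_nt / genome_total
    return sponge_fraction / genome_fraction

# Made-up nucleotide counts per category, for illustration only.
sponge_counts = {"circRNA not PCG": 300, "PCG not circRNA": 500}
genome_counts = {"circRNA not PCG": 1_000_000, "PCG not circRNA": 6_000_000}
sponge_total = 1_000        # nucleotides in annotated sponge candidates
genome_total = 10_000_000   # unmasked nucleotides in the genome

ratios = {
    category: enrichment(sponge_counts[category], sponge_total,
                         genome_counts[category], genome_total)
    for category in sponge_counts
}
for category, ratio in ratios.items():
    print(f"{category}: {ratio:.2f}")
```

A ratio above 1 indicates that a transcript type covers a larger share of sponge nucleotides than of the genome as a whole, and a ratio below 1 indicates depletion.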
Fig 5. Web resource of miRNA sponge candidates.
To illustrate the web resource, we show the results for miR-7. (A) When searching for a miRNA, the user is presented with an overview table of the corresponding miRNA sponge candidates. In the case of miR-7, our pipeline suggests four sponge candidates, the top-scoring of which is the known sponge ciRS-7 (named hsa_circ_0001946 in circBase). Clicking "detail" opens a page with detailed properties of this sponge candidate. (B) Clicking the coordinate of a sponge opens the UCSC genome browser with tracks showing conservation, the accessibility profile (probability of the bases being unpaired in the RNA structure), and the target profile (binding energies for the miRNA as predicted by RIsearch2). In the example, we show the region chrX: 139 865 280–139 866 947 (+), which corresponds to ciRS-7.




Grant support

This project was mainly financed by University of Copenhagen with additional support from the Novo Nordisk Foundation [NNF14CC0001], the Danish Center for Scientific Computing (DCSC/DEiC), and Innovation Fund Denmark (Programme Commission on Strategic Growth Technologies) [0603-00320B].