A statistical approach for identifying primary substrates of ZSWIM8-mediated microRNA degradation in small-RNA sequencing data

BMC Bioinformatics. 2023 May 12;24(1):195. doi: 10.1186/s12859-023-05306-z.


Background: One strategy for identifying targets of a regulatory factor is to perturb the factor and use high-throughput RNA sequencing to examine the consequences. However, distinguishing direct targets from secondary effects and experimental noise can be challenging when confounding signal is present in the background at varying levels.

Results: Here, we present a statistical modeling strategy to identify microRNAs that are primary substrates of target-directed miRNA degradation (TDMD) mediated by ZSWIM8. This method uses a bi-beta-uniform mixture (BBUM) model to separate primary from background signal components, leveraging the expectation that primary signal is restricted to upregulation, and never downregulation, upon loss of ZSWIM8. The BBUM modeling strategy retained the apparent sensitivity and specificity of the previous ad hoc approach but was more robust against outliers, achieved a more consistent stringency, and required only a single false discovery rate (FDR) cutoff.
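To make the mixture idea concrete, the following is a minimal sketch of a p-value density built from a uniform null component and two beta components, one for background (secondary) signal and one for primary signal. It is not the API or the exact parametrization of the bbum R package. The Beta(a, 1) form of both signal components, the function names, and the parameter names (lam, a_sec, theta, a_pri) and example values are all assumptions chosen for illustration.

```python
# Illustrative sketch only: an ASSUMED two-beta-plus-uniform mixture for
# p-values. Names and parametrization are hypothetical, not the bbum package's.


def beta_shape1_pdf(p, a):
    """Density of a Beta(a, 1) distribution on (0, 1]: a * p**(a - 1)."""
    return a * p ** (a - 1.0)


def background_pdf(p, lam, a_sec):
    """Uniform null (weight lam) plus a beta component for secondary effects."""
    return lam + (1.0 - lam) * beta_shape1_pdf(p, a_sec)


def mixture_pdf(p, lam, a_sec, theta, a_pri):
    """Background (weight 1 - theta) plus a primary beta component (weight theta).

    Setting theta = 0 gives the background-only model, which is what one would
    use for the direction in which no primary signal is expected.
    """
    return ((1.0 - theta) * background_pdf(p, lam, a_sec)
            + theta * beta_shape1_pdf(p, a_pri))


def mixture_cdf(p, lam, a_sec, theta, a_pri):
    """Closed-form CDF; the CDF of Beta(a, 1) is p**a."""
    background = lam * p + (1.0 - lam) * p ** a_sec
    return (1.0 - theta) * background + theta * p ** a_pri


def primary_posterior(p, lam, a_sec, theta, a_pri):
    """Probability that a p-value came from the primary component (Bayes' rule)."""
    return theta * beta_shape1_pdf(p, a_pri) / mixture_pdf(p, lam, a_sec, theta, a_pri)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the behavior.
    params = dict(lam=0.8, a_sec=0.3, theta=0.1, a_pri=0.05)
    for p in (1e-6, 1e-3, 0.05, 0.5):
        print(p, primary_posterior(p, **params))
```

With a smaller shape for the primary component than for the secondary one, the posterior probability of the primary component rises as the p-value shrinks, which is the property that lets a single cutoff separate primary from background signal in this sketch.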

Conclusions: We developed the BBUM model, a robust statistical modeling strategy to account for background secondary signal in differential expression data. It performed well for identifying primary substrates of TDMD and should be useful for other applications in which the primary regulatory targets are only upregulated or only downregulated. The BBUM model, FDR-correction algorithm, and significance-testing methods are available as an R package at https://github.com/wyppeter/bbum .

Keywords: Differential expression; Mixture models; RNA-seq; Small-RNA sequencing; microRNA.

MeSH terms

  • Algorithms
  • Base Sequence
  • High-Throughput Nucleotide Sequencing / methods
  • MicroRNAs* / genetics
  • Models, Statistical
  • Sequence Analysis, RNA / methods