MicroRNAs (miRNAs) are endogenous non-coding RNAs that post-transcriptionally inhibit the expression of their target genes. Several tools have been developed for predicting known (annotated) miRNAs, but there is no consensus on how to select the most suitable tool for a given species. In this study, eight miRNA prediction tools (mirnovo, miRPlant, miRDeep-P2, miRExpress, miRkwood, miRDeep2, miR-PREFeR, and sRNAbench) were selected for evaluation. High-throughput small RNA sequencing data from four plant species (Arabidopsis thaliana, Oryza sativa, Triticum aestivum, and Zea mays), covering C3 and C4 species and both monocots and dicots, were used for the analysis. The sensitivity, accuracy, area under the curve, consistency, run time, and RAM usage of the known miRNA predictions were evaluated for each tool. The miRNA annotations were obtained from miRBase and sRNAanno. Methods such as random forest, BLAST, and receiver operating characteristic curve analysis were used to evaluate accuracy. Of the tools evaluated, sRNAbench was the most accurate, miRDeep-P2 was both the most sensitive and the fastest, and miRkwood had the highest memory usage. Because of the large genome size of wheat (Triticum aestivum), only three tools were able to predict known miRNAs in this species. These results allow us to recommend the tool best suited to a variety of researcher needs, which we hope will reduce confusion and support future work.
Keywords: known miRNAs; random forest; receiver operating characteristic; sRNA‐Seq.
© 2021 Li et al. Applications in Plant Sciences is published by Wiley Periodicals LLC on behalf of the Botanical Society of America.