Identification of microRNA precursor with the degenerate K-tuple or Kmer strategy

J Theor Biol. 2015 Nov 21;385:153-9. doi: 10.1016/j.jtbi.2015.08.025. Epub 2015 Sep 9.

Abstract

The microRNA (miRNA), a small non-coding RNA molecule, plays an important role in transcriptional and post-transcriptional regulation of gene expression. Its abnormal expression, however, has been observed in many cancers and other disease states, implying that miRNA molecules are also deeply involved in these diseases, particularly in carcinogenesis. Therefore, it is important for both basic research and miRNA-based therapy to discriminate real pre-miRNAs from false ones (such as hairpin sequences with similar stem-loops). Most existing methods for this task represent each RNA sample as a vector of its Kmer components. The Kmers must be very short, however; otherwise, the vector's dimension becomes extremely large, leading to the "high-dimension disaster" or overfitting problem. Inspired by the concept of "degenerate energy levels" in quantum mechanics, we introduced the "degenerate Kmer" (deKmer) to represent RNA samples. By doing so, we can not only accommodate long-range coupling effects but also avoid the high-dimension problem. Rigorous jackknife tests and cross-species experiments indicated that our approach is very promising. It has not escaped our notice that the deKmer approach can also be applied to many other areas of computational biology. A user-friendly web-server for the new predictor has been established at http://bioinformatics.hitsz.edu.cn/miRNA-deKmer/, by which users can easily get their desired results.
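
To make the dimensionality argument concrete, the following is a minimal sketch of the conventional (non-degenerate) Kmer representation that the abstract contrasts against. It is not the deKmer method, whose definition the abstract does not give. The function names `kmer_dimension` and `kmer_vector`, the ACGU alphabet ordering, and the normalization by window count are all assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; other symbols are skipped


def kmer_dimension(k):
    """Number of distinct Kmers over a 4-letter alphabet, i.e. the vector length."""
    return len(ALPHABET) ** k


def kmer_vector(seq, k):
    """Normalized Kmer frequencies of seq, ordered lexicographically over ACGU.

    Illustrative only: this is the plain Kmer composition, not deKmer.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # ignore windows containing non-ACGU symbols
            counts[window] += 1
            windows += 1
    # Divide by the number of counted windows so the entries sum to 1.
    return [counts[m] / windows if windows else 0.0 for m in kmers]


if __name__ == "__main__":
    for k in (2, 6, 10):
        print(k, kmer_dimension(k))
```

Under these assumptions the vector length is 4^K: 16 for K = 2, 4,096 for K = 6, and 1,048,576 for K = 10. Since a pre-miRNA hairpin is far shorter than the number of possible long Kmers, almost all entries would be zero for large K, which is the sparsity and overfitting concern raised above.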

Keywords: Degenerate Kmer; False pre-miRNA; Long-range effect; MicroRNA precursor; True pre-miRNA; deKmer web-server.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Computational Biology / methods
  • Databases, Genetic
  • Genetic Vectors / genetics
  • Humans
  • Internet
  • MicroRNAs / genetics*
  • RNA Precursors / genetics*
  • Sequence Analysis, RNA / methods*
  • Species Specificity
  • Support Vector Machine

Substances

  • MicroRNAs
  • RNA Precursors