CLIP-based prediction of mammalian microRNA binding sites

Nucleic Acids Res. 2013 Aug;41(14):e138. doi: 10.1093/nar/gkt435. Epub 2013 May 22.


Prediction and validation of microRNA (miRNA) targets are essential for understanding the functions of miRNAs in gene regulation. Crosslinking immunoprecipitation (CLIP) allows direct identification of a large number of Argonaute-bound target sequences that contain miRNA binding sites. By analysing data from CLIP studies, we identified a comprehensive list of sequence, thermodynamic and target structure features that are essential for target binding by miRNAs in the 3' untranslated region (3' UTR), coding sequence (CDS) region and 5' untranslated region (5' UTR) of target messenger RNA (mRNA). The total energy of miRNA:target hybridization, a measure of target structural accessibility, is the only essential feature common to both seed and seedless sites in all three target regions. Furthermore, evolutionary conservation is an important discriminating feature for both seed and seedless sites. These features enabled us to develop novel statistical models for predicting both seed sites and broad classes of seedless sites. Through both intra-dataset validation and inter-dataset validation, our approach showed major improvements over established algorithms for predicting seed sites and a class of seedless sites. Furthermore, we observed good performance from cross-species validation, suggesting that our prediction framework can be valuable for broad application to other mammalian species and beyond. Transcriptome-wide binding site predictions enabled by our approach will greatly complement the available CLIP data, which only cover small fractions of transcriptomes and known miRNAs due to non-detectable levels of expression. Software and database tools based on the prediction models have been developed and are available through the Sfold web server at
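To make the distinction between seed and seedless sites concrete, the following is a minimal sketch of how a seed site might be located in a target sequence. It assumes the commonly used 6mer definition (perfect Watson-Crick pairing to miRNA nucleotides 2-7); the abstract does not state the exact seed definition used in the paper, and the function names `reverse_complement` and `find_seed_sites` are hypothetical, not part of the Sfold tools.

```python
from typing import List


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (DNA 'T' is read as 'U')."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    normalized = rna.upper().replace("T", "U")
    return "".join(complement[base] for base in reversed(normalized))


def find_seed_sites(mirna: str, target: str) -> List[int]:
    """Return 0-based start positions in `target` of 6mer seed matches.

    Assumption: a seed site is a perfect Watson-Crick match to miRNA
    nucleotides 2-7 (the 6mer seed). Both sequences are written 5'->3'.
    """
    seed = mirna.upper().replace("T", "U")[1:7]   # nucleotides 2-7
    motif = reverse_complement(seed)              # sequence expected in the target
    target = target.upper().replace("T", "U")

    positions = []
    start = target.find(motif)
    while start != -1:
        positions.append(start)
        start = target.find(motif, start + 1)     # allow overlapping matches
    return positions
```

For example, `find_seed_sites("UGAGGUAGUAGGUUGUAUAGUU", "AAAUACCUCAAA")` scans a toy target for the let-7a seed match `UACCUC`. A target with no such match would be a candidate only for seedless-site models, which depend on features such as hybridization energy and conservation rather than seed pairing.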

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • 3' Untranslated Regions
  • 5' Untranslated Regions
  • Algorithms
  • Argonaute Proteins / metabolism
  • Binding Sites
  • Databases, Nucleic Acid
  • HEK293 Cells
  • Humans
  • Immunoprecipitation / methods
  • Logistic Models
  • MicroRNAs / metabolism*
  • RNA, Messenger / chemistry*
  • RNA, Messenger / metabolism
  • Software

