Predicting in vivo binding sites of RNA-binding proteins using mRNA secondary structure

RNA. 2010 Jun;16(6):1096-107. doi: 10.1261/rna.2017210. Epub 2010 Apr 23.

Abstract

While many RNA-binding proteins (RBPs) bind RNA in a sequence-specific manner, their sequence preferences alone do not distinguish known target RNAs from other potential targets that are coexpressed and contain the same sequence motifs. Recently, the mRNA targets of dozens of RNA-binding proteins have been identified, facilitating a systematic study of the features of target transcripts. Using these data, we demonstrate that calculating the predicted structural accessibility of a putative RBP binding site allows one to significantly improve the accuracy of predicting in vivo binding for the majority of sequence-specific RBPs. In our new in silico approach, accessibility is predicted based solely on the mRNA sequence without consideration of the locations of bound trans-factors; as such, our results suggest a greater than previously anticipated role for intrinsic mRNA secondary structure in determining RBP binding target preference. Target site accessibility aids in predicting target transcripts and the binding sites for RBPs with a range of RNA-binding domains and subcellular functions. Based on this work, we introduce a new motif-finding algorithm that identifies accessible sequence-specific RBP motifs from in vivo binding data.
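The core idea of the abstract, scoring each putative motif occurrence by how accessible it is predicted to be, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that per-nucleotide unpaired probabilities have already been computed from the mRNA sequence by some external folding tool. The function names `score_sites` and `transcript_score`, the use of a mean over the site, the use of the maximum over sites, and the toy sequence, motif, and probabilities are all hypothetical choices made only for illustration.

```python
import re
from typing import List, Tuple

Site = Tuple[int, int, float]  # (start, end, accessibility), end is exclusive


def score_sites(seq: str, motif_regex: str, unpaired_prob: List[float]) -> List[Site]:
    """Find every (possibly overlapping) motif match and score its accessibility.

    Accessibility is taken here as the mean unpaired probability over the
    matched positions. This is an illustrative assumption, not the measure
    defined in the paper.
    """
    if len(seq) != len(unpaired_prob):
        raise ValueError("need one unpaired probability per nucleotide")

    # A lookahead with a capture group lets finditer report overlapping matches.
    pattern = re.compile(f"(?=({motif_regex}))")
    sites = []
    for match in pattern.finditer(seq):
        start, end = match.start(1), match.end(1)
        window = unpaired_prob[start:end]
        sites.append((start, end, sum(window) / len(window)))
    return sites


def transcript_score(sites: List[Site]) -> float:
    """Summarize a transcript by its most accessible site (0.0 if no site)."""
    return max((acc for _, _, acc in sites), default=0.0)


# Toy input: two matches of a hypothetical 8-nt motif, one in a mostly
# paired region (low probabilities) and one in an open region (high).
toy_seq = "AAUGUAUAUAGGCCUGUAAAUACC"
toy_prob = [0.9, 0.9] + [0.1] * 8 + [0.5] * 4 + [0.8] * 8 + [0.9, 0.9]
toy_sites = score_sites(toy_seq, "UGUA.AUA", toy_prob)

for start, end, acc in toy_sites:
    print(f"site {start}-{end}: accessibility {acc:.2f}")
print(f"transcript score: {transcript_score(toy_sites):.2f}")
```

In this toy case both occurrences match the sequence motif equally well, so sequence alone cannot rank them. The accessibility score separates them, which is the kind of distinction the abstract argues improves prediction of in vivo binding.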

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 3' Untranslated Regions / genetics
  • Animals
  • Binding Sites
  • Consensus Sequence
  • Conserved Sequence
  • Dinucleotide Repeats
  • Drosophila / genetics
  • Drosophila / metabolism
  • Drosophila Proteins / chemistry
  • Drosophila Proteins / genetics
  • Drosophila Proteins / metabolism
  • Humans
  • Oligonucleotide Array Sequence Analysis
  • Predictive Value of Tests
  • RNA, Messenger / chemistry
  • RNA, Messenger / genetics*
  • RNA-Binding Proteins / chemistry*
  • RNA-Binding Proteins / genetics
  • RNA-Binding Proteins / metabolism*
  • Transcription, Genetic*

Substances

  • 3' Untranslated Regions
  • Drosophila Proteins
  • RNA, Messenger
  • RNA-Binding Proteins
  • pum protein, Drosophila