Improve the prediction of RNA-binding residues using structural neighbours

Protein Pept Lett. 2010 Mar;17(3):287-96. doi: 10.2174/092986610790780279.

Abstract

Interactions between RNA-binding proteins (RBPs) and RNA play key roles in managing some of the cell's basic functions. Identifying and predicting RNA-binding sites is important for understanding the RNA-binding mechanism. Computational approaches are being developed to predict RNA-binding residues from sequence- or structure-derived features, but current methods must be improved to achieve higher prediction accuracy. We found that the structural neighbors of RNA-binding and non-RNA-binding residues have different amino acid compositions. Combining this structure-derived feature with evolutionary information (PSSM) and other structural information (secondary structure and solvent accessibility) significantly improves predictions over existing methods. Using a multiple linear regression approach and 6-fold cross-validation, our best model achieves an overall correct rate of 87.8% and an MCC of 0.47, with a specificity of 93.4%, and correctly predicts 52.4% of the RNA-binding residues in a dataset of 107 non-homologous RNA-binding proteins. Compared with existing methods, including the amino acid composition of structural neighbors leads to a clear improvement. A web server for predicting RNA-binding residues in a protein sequence (or structure) is available at http://mcgill.3322.org/RNA/.
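The four figures reported above (overall correct rate, MCC, specificity, and the fraction of binding residues recovered, i.e. sensitivity) all derive from a single confusion matrix. The sketch below shows the standard definitions, with MCC taken to be the Matthews correlation coefficient. The function name `binary_metrics` and the residue counts are hypothetical: the counts are not taken from the paper, they were only chosen so that the resulting rates roughly match the reported percentages.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return standard per-residue metrics from confusion-matrix counts.

    tp: binding residues predicted as binding
    fp: non-binding residues predicted as binding
    tn: non-binding residues predicted as non-binding
    fn: binding residues predicted as non-binding
    """
    total = tp + fp + tn + fn
    # Denominator of the Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,       # overall correct rate
        "sensitivity": tp / (tp + fn),       # binding residues recovered
        "specificity": tn / (tn + fp),       # non-binding residues recovered
        # By convention MCC is set to 0 when any marginal total is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts for 10,000 residues (an assumption, not the paper's data).
example = binary_metrics(tp=716, fp=570, tn=8064, fn=650)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With only a minority of residues being RNA-binding, as in these illustrative counts, the overall correct rate is dominated by the non-binding class, which is why the abstract reports MCC and sensitivity alongside it.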

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Linear Models
  • Protein Binding
  • Protein Interaction Domains and Motifs / genetics*
  • Proteomics / methods*
  • RNA-Binding Proteins / chemistry*
  • RNA-Binding Proteins / genetics
  • RNA-Binding Proteins / metabolism
  • ROC Curve
  • Reproducibility of Results
  • Sequence Analysis, Protein / methods*

Substances

  • RNA-Binding Proteins