An efficient support vector machine approach for identifying protein S-nitrosylation sites

Protein Pept Lett. 2011 Jun;18(6):573-87. doi: 10.2174/092986611795222731.

Abstract

Protein S-nitrosylation plays a key and specific role in many cellular processes. Detecting possible S-nitrosylated substrates and their exact sites is crucial for studying the mechanisms of these biological processes. Compared with expensive and time-consuming biochemical experiments, computational methods are attracting considerable attention because of their convenience and speed. Although some computational models have been developed to predict S-nitrosylation sites, their accuracy is still low. In this work, we use a support vector machine to predict protein S-nitrosylation sites. After a careful evaluation of six encoding schemes, we propose a new efficient predictor, CPR-SNO, based on the coupling-patterns encoding scheme. CPR-SNO achieves an area under the ROC curve (AUC) of 0.8289 in 10-fold cross-validation experiments, significantly better than the 0.685 of the best existing method, GPS-SNO 1.0. In further annotating large-scale potential S-nitrosylated substrates, CPR-SNO also shows encouraging predictive performance. These results indicate that CPR-SNO can serve the biological community as a competitive predictor of protein S-nitrosylation sites. CPR-SNO has been implemented as a web server and is available at http://math.cau.edu.cn/CPR-SNO/CPR-SNO.html.
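
As a rough illustration of the kind of pipeline the abstract describes, the sketch below extracts sequence windows centered on cysteine residues, encodes them as numeric vectors, and computes the AUC used to compare predictors. It is not the authors' implementation: the abstract does not define the coupling-patterns encoding, so a plain binary (one-hot) encoding stands in for it, and the window size, the padding symbol, and the function names (`cysteine_windows`, `one_hot`, `auc`) are all assumptions made for this example. The classifier itself is omitted; any model that outputs a score per site could feed the `auc` function.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # hypothetical padding symbol for windows that run past the sequence ends


def cysteine_windows(sequence: str, flank: int = 3) -> List[Tuple[int, str]]:
    """Return (position, window) pairs for every cysteine in the sequence.

    Each window holds `flank` residues on either side of the cysteine.
    """
    padded = PAD * flank + sequence + PAD * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "C":
            # Position i in `sequence` sits at i + flank in `padded`,
            # so the window starts at index i of `padded`.
            windows.append((i, padded[i : i + 2 * flank + 1]))
    return windows


def one_hot(window: str) -> List[int]:
    """Binary encoding (a stand-in, not the coupling-patterns scheme).

    Each residue becomes 20 indicator values; padding becomes all zeros.
    """
    vector: List[int] = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def auc(labels: List[int], scores: List[float]) -> float:
    """Area under the ROC curve.

    Computed as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

In a 10-fold cross-validation setting, the scores passed to `auc` would come from models trained on nine folds and applied to the held-out fold, so that every site is scored by a model that never saw it during training.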

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Artificial Intelligence*
  • Binding Sites
  • Humans
  • Internet
  • Mice
  • Molecular Sequence Annotation
  • Nitrogen Oxides / metabolism*
  • Pattern Recognition, Automated
  • Protein Processing, Post-Translational*
  • Proteins / chemistry*
  • Proteins / metabolism*
  • ROC Curve

Substances

  • Nitrogen Oxides
  • Proteins