iSNO-PseAAC: predict cysteine S-nitrosylation sites in proteins by incorporating position specific amino acid propensity into pseudo amino acid composition

PLoS One. 2013;8(2):e55844. doi: 10.1371/journal.pone.0055844. Epub 2013 Feb 7.

Abstract

Posttranslational modifications (PTMs) of proteins are responsible for sensing and transducing signals to regulate various cellular functions and signaling events. S-nitrosylation (SNO) is one of the most important and universal PTMs. With the avalanche of protein sequences generated in the post-genomic age, there is a strong need for computational methods that identify the exact SNO sites in proteins in a timely manner, because this information is very useful for both basic research and drug development. Here, a new predictor, called iSNO-PseAAC, was developed for identifying the SNO sites in proteins by incorporating the position-specific amino acid propensity (PSAAP) into the general form of pseudo amino acid composition (PseAAC). The predictor was implemented using the conditional random field (CRF) algorithm. As a demonstration, a benchmark dataset was constructed that contains 731 SNO sites and 810 non-SNO sites. To reduce homology bias, none of these sites were derived from proteins that had [Formula: see text] pairwise sequence identity to any other. The overall cross-validation success rate achieved by iSNO-PseAAC in identifying nitrosylated proteins on an independent dataset was over 90%, indicating that the new predictor is quite promising. Furthermore, a user-friendly web server for iSNO-PseAAC was established at http://app.aporc.org/iSNO-PseAAC/, through which users can obtain the desired results without having to follow the mathematical equations involved in developing the prediction method. It is anticipated that iSNO-PseAAC may become a useful high-throughput tool for identifying SNO sites, or at the very least play a complementary role to the existing methods in this area.
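To make the idea of a position-specific amino acid propensity concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each candidate site is represented by a fixed-length peptide window centered on the cysteine, that the propensity of a residue at a position is the difference between its frequency in SNO windows and in non-SNO windows, and that a peptide is encoded by looking up the propensity of each flanking residue. The function names, the toy 5-residue windows, and the difference-of-frequencies definition are all illustrative assumptions; the abstract does not specify them, and the CRF classifier is not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(peptides):
    """Return one dict per window position mapping residue -> relative frequency."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    freqs = []
    for j in range(length):
        counts = Counter(p[j] for p in peptides)
        freqs.append({a: counts.get(a, 0) / len(peptides) for a in AMINO_ACIDS})
    return freqs


def propensity_matrix(positives, negatives):
    """Assumed propensity: frequency in positive windows minus frequency in negative windows."""
    pos = position_frequencies(positives)
    neg = position_frequencies(negatives)
    return [{a: pos[j][a] - neg[j][a] for a in AMINO_ACIDS} for j in range(len(pos))]


def encode(peptide, matrix):
    """Encode a peptide as the propensities of its flanking residues (center cysteine skipped)."""
    center = len(peptide) // 2
    # Residues outside the 20 standard amino acids (e.g. padding symbols) score 0.0.
    return [matrix[j].get(peptide[j], 0.0) for j in range(len(peptide)) if j != center]


# Hypothetical toy windows of length 5 with the cysteine in the middle.
toy_positives = ["AKCDE", "AKCDG"]
toy_negatives = ["GKCAE", "AKCAG"]
toy_matrix = propensity_matrix(toy_positives, toy_negatives)
toy_features = encode("AKCDE", toy_matrix)
```

In this sketch the resulting feature vector has one entry per flanking position, so a window of length L yields L - 1 features that a downstream classifier could consume.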

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acids
  • Cysteine / metabolism*
  • Databases, Protein
  • Internet
  • Protein Processing, Post-Translational*
  • Proteins / chemistry*
  • Proteins / metabolism*
  • Reproducibility of Results
  • Software*

Substances

  • Amino Acids
  • Proteins
  • Cysteine

Grant support

This work is partially supported by the National Natural Science Foundation of China (No. 11101029, No. 10971223, No. 60970091, No. 11131009, No. 11071013) and the Fundamental Research Funds for the Central Universities, NCET of China (No. NCET-11-0574), and Knowledge Innovation Program of the Chinese Academy of Sciences (No. kjcx-yw-s7). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.