CarSPred: a computational tool for predicting carbonylation sites of human proteins

PLoS One. 2014 Oct 27;9(10):e111478. doi: 10.1371/journal.pone.0111478. eCollection 2014.

Abstract

Protein carbonylation is one of the most pervasive oxidative stress-induced post-translational modifications (PTMs) and plays a significant role in the etiology and progression of several human diseases. It is regarded as a biomarker of oxidative stress because it forms relatively early and is more stable than other oxidative PTMs. Only a subset of proteins is prone to carbonylation, and most carbonyl groups are formed from lysine (K), arginine (R), threonine (T) and proline (P) residues. Recent advances in mass spectrometry analysis of this PTM have provided new insights into the mechanisms of protein carbonylation, such as protein susceptibility and exact modification sites. However, experimental approaches to identifying carbonylation sites are costly and time-consuming and can process only a limited number of proteins, and so far no bioinformatics method or tool has been devoted to predicting carbonylation sites of human proteins. In this paper, a computational method is proposed to identify carbonylation sites of human proteins. The method extracts four kinds of features and combines the minimum Redundancy Maximum Relevance (mRMR) feature selection criterion with a weighted support vector machine (WSVM), achieving total accuracies of 85.72%, 85.95%, 83.92% and 85.72% for K, R, T and P carbonylation site predictions, respectively, under 10-fold cross-validation. The final optimal feature sets are analysed, and the position-specific composition and hydrophobicity environment of the residues flanking the modification sites are discussed. In addition, a software tool named CarSPred has been developed to facilitate application of the method. The datasets and software described in the paper are available at https://sourceforge.net/projects/hqlstudio/files/CarSPred-1.0/.
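To make the notion of "flanking residues of modification sites" concrete, the following is a minimal Python sketch of how candidate K, R, T and P positions and their surrounding sequence windows might be enumerated before any feature extraction. It is an illustration only and is not taken from CarSPred: the function name `candidate_windows`, the `flank` half-width, the `-` padding character and the example sequence are all hypothetical choices, and the abstract does not state the window size or encoding the authors used.

```python
from typing import Iterator, NamedTuple

# Residue types the abstract names as the main sources of carbonyl groups.
CARBONYLATABLE = "KRTP"


class CandidateSite(NamedTuple):
    position: int  # 1-based position in the protein sequence
    residue: str   # one-letter code of the candidate residue
    window: str    # candidate residue plus its flanking residues


def candidate_windows(
    sequence: str, flank: int = 5, pad: str = "-"
) -> Iterator[CandidateSite]:
    """Yield every K/R/T/P residue with `flank` residues on each side.

    `flank` and `pad` are illustrative assumptions; windows that run past
    either terminus are filled with the padding character.
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    for i, aa in enumerate(seq):
        if aa in CARBONYLATABLE:
            # In `padded`, residue i sits at index i + flank, so the window
            # of length 2 * flank + 1 starts at index i.
            yield CandidateSite(i + 1, aa, padded[i : i + 2 * flank + 1])


if __name__ == "__main__":
    # Made-up peptide used purely for demonstration.
    for site in candidate_windows("MKTAYIAKQR", flank=2):
        print(site.position, site.residue, site.window)
```

Each window could then be turned into a feature vector (for example, per-position residue composition) and passed to a feature selector and classifier; those later steps are not shown here because the abstract does not specify them in enough detail.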

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Models, Biological*
  • Protein Carbonylation*
  • Proteome / chemistry
  • Proteome / metabolism*
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Proteome

Grant support

This work was supported by grants from the National Natural Science Foundation of China (No. 61105021) and the Ph.D. Program Foundation of the Ministry of Education of China (No. 20110201110010). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.