Mal-Lys: prediction of lysine malonylation sites in proteins integrated sequence-based features with mRMR feature selection

Sci Rep. 2016 Dec 2:6:38318. doi: 10.1038/srep38318.

Abstract

Lysine malonylation is an important post-translational modification (PTM) of proteins and has been associated with disease. However, identifying malonyllysine sites remains a major challenge because the experiments required are labor-intensive and time-consuming. An efficient computational predictor is therefore highly desirable. In this study we propose Mal-Lys, a predictor that incorporates residue sequence order information, position-specific amino acid propensity, and physicochemical properties. The minimum Redundancy Maximum Relevance (mRMR) feature selection method was used to select an optimal subset from the full feature set. Under leave-one-out validation, the area under the curve (AUC) was 0.8143, and 6-, 8-, and 10-fold cross-validation gave similar AUC values, indicating that Mal-Lys is robust. The predictor also performed satisfactorily on experimental data from the UniProt database. A user-friendly web server for Mal-Lys is accessible at http://app.aporc.org/Mal-Lys/.
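
The abstract does not say how mRMR was implemented, so the following is only a minimal sketch of the general idea, not the authors' code. It assumes discrete feature values, uses the common "difference" scoring form (relevance to the label minus mean redundancy with already selected features), and the names `mutual_information` and `mrmr_select` are hypothetical helpers introduced for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(columns, labels, k):
    """Greedy mRMR over a list of feature columns; returns k column indices.

    Assumption: score = relevance - mean redundancy (the "difference" form).
    """
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (invented for illustration): the label is determined jointly by a and b.
a = [0, 0, 0, 0, 1, 1, 1, 1]
a_copy = list(a)                      # a fully redundant duplicate of a
b = [0, 0, 1, 1, 0, 0, 1, 1]
y = [2 * ai + bi for ai, bi in zip(a, b)]
chosen = mrmr_select([a, a_copy, b], y, 2)
```

In this toy case all three columns are equally relevant to the label, but the duplicate column is penalized for its redundancy with the first pick, so the second pick is the independent column. Real sequence-derived features are typically continuous and would need discretization or a different mutual information estimator.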

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Animals
  • Carbonic Anhydrases / metabolism*
  • Computational Biology / methods
  • Databases, Protein
  • Internet
  • Lysine / metabolism*
  • Malonates / metabolism*
  • Mice
  • Probability
  • Protein Processing, Post-Translational*

Substances

  • Malonates
  • malonic acid
  • Carbonic Anhydrases
  • Lysine