HybridSucc: A Hybrid-learning Architecture for General and Species-specific Succinylation Site Prediction

Genomics Proteomics Bioinformatics. 2020 Apr;18(2):194-207. doi: 10.1016/j.gpb.2019.11.010. Epub 2020 Aug 28.

Abstract

As an important protein acylation modification, lysine succinylation (Ksucc) is involved in diverse biological processes and participates in human tumorigenesis. Here, we collected 26,243 non-redundant known Ksucc sites from 13 species as the benchmark data set, combined 10 types of informative features, and implemented a hybrid-learning architecture by integrating deep-learning and conventional machine-learning algorithms into a single framework. We constructed a new tool named HybridSucc, which achieved area under the curve (AUC) values of 0.885 and 0.952 for general and human-specific prediction of Ksucc sites, respectively. In comparison, the accuracy of HybridSucc was 17.84% to 50.62% better than that of other existing tools. Using HybridSucc, we conducted a proteome-wide prediction and prioritized 370 cancer mutations that change the Ksucc states of 218 important proteins, including PKM2, SHMT2, and IDH2. We not only developed a high-profile tool for predicting Ksucc sites, but also generated useful candidates for further experimental consideration. The online service of HybridSucc can be freely accessed for academic research at http://hybridsucc.biocuckoo.org/.
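To make the prediction setting concrete, the following is a minimal Python sketch of two generic steps that site predictors of this kind typically rely on: extracting a sequence window around each candidate lysine, and scoring predictions with a rank-based AUC. It is not the HybridSucc implementation. The abstract does not state the window length or name the 10 feature types, so the function names `lysine_windows` and `roc_auc`, the flank size of 10 residues, and the `-` padding character are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def lysine_windows(sequence: str, flank: int = 10, pad: str = "-") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every lysine (K) in a protein sequence.

    The flank size and padding character are illustrative assumptions,
    not values taken from the HybridSucc abstract.
    """
    # Pad both ends so lysines near the termini still yield full-length windows.
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Index i in the raw sequence sits at i + flank in the padded string,
            # so this slice is centered on the lysine.
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the probability that a random positive site
    receives a higher score than a random negative site (ties count 0.5)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy sequence and toy scores, invented purely for demonstration.
    print(lysine_windows("MKAK", flank=2))
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In this sketch an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of succinylated from non-succinylated lysines, which is the scale on which the reported values of 0.885 and 0.952 should be read.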

Keywords: Deep neural network; Deep-learning; Lysine succinylation; Machine-learning; Post-translational modification.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acylation
  • Algorithms*
  • Amino Acid Sequence
  • Area Under Curve
  • Humans
  • Lysine / metabolism
  • Machine Learning*
  • Neoplasms / metabolism
  • Proteins / metabolism*
  • Proteome / metabolism
  • ROC Curve
  • Species Specificity
  • Succinic Acid / metabolism*

Substances

  • Proteins
  • Proteome
  • Succinic Acid
  • Lysine