MusiteDeep: a deep-learning framework for general and kinase-specific phosphorylation site prediction

Bioinformatics. 2017 Dec 15;33(24):3909-3916. doi: 10.1093/bioinformatics/btx496.

Abstract

Motivation: Computational methods for phosphorylation site prediction play important roles in protein function studies and experimental design. Most existing methods are based on feature extraction, which may result in incomplete or biased features. Deep learning, a cutting-edge machine learning approach, can automatically discover complex representations of phosphorylation patterns from raw sequences, and hence provides a powerful tool for improving phosphorylation site prediction.

Results: We present MusiteDeep, the first deep-learning framework for predicting general and kinase-specific phosphorylation sites. MusiteDeep takes raw sequence data as input and uses convolutional neural networks with a novel two-dimensional attention mechanism. It achieves over a 50% relative improvement in the area under the precision-recall curve in general phosphorylation site prediction and obtains competitive results in kinase-specific prediction compared to other well-known tools on the benchmark data.
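As a rough illustration of what "raw sequence data as input" can mean for a convolutional model, the sketch below extracts a fixed-length window around each candidate S/T/Y residue and one-hot encodes it. It is not taken from the MusiteDeep code: the flank size, the padding symbol, the handling of unknown residues, and the function names candidate_windows and one_hot are all assumptions made for this example.

```python
# Illustrative sketch only. The window size, alphabet ordering, padding
# symbol, and function names are assumptions, not MusiteDeep's actual code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # hypothetical symbol for positions beyond the sequence ends
ALPHABET = AMINO_ACIDS + PAD
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def candidate_windows(sequence, targets="STY", flank=7):
    """Yield (position, window) for each residue found in `targets`.

    `position` is 0-based. Each window has length 2 * flank + 1, is
    centered on the candidate residue, and is padded with PAD where it
    extends past either end of the sequence.
    """
    padded = PAD * flank + sequence + PAD * flank
    for pos, residue in enumerate(sequence):
        if residue in targets:
            yield pos, padded[pos:pos + 2 * flank + 1]


def one_hot(window):
    """Encode a window as a list of rows, one row per residue."""
    rows = []
    for residue in window:
        row = [0] * len(ALPHABET)
        # Assumption: unknown symbols (e.g. X) share the padding column.
        row[INDEX.get(residue, INDEX[PAD])] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    seq = "MKSPTAYG"  # toy sequence, not real data
    for pos, window in candidate_windows(seq, flank=3):
        matrix = one_hot(window)
        print(pos, window, len(matrix), len(matrix[0]))
```

Each encoded window is a (2 * flank + 1) by 21 matrix, which is the kind of input a one-dimensional convolution over sequence positions could consume. The actual input length and encoding used by MusiteDeep should be taken from the paper and its repository.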

Availability and implementation: MusiteDeep is provided as an open-source tool available at https://github.com/duolinwang/MusiteDeep.

Contact: xudong@missouri.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Machine Learning*
  • Neural Networks, Computer
  • Phosphoproteins / chemistry*
  • Phosphorylation
  • Protein Kinases / metabolism
  • Proteins / metabolism
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Phosphoproteins
  • Proteins
  • Protein Kinases