Signal-3L 3.0: Improving Signal Peptide Prediction through Combining Attention Deep Learning with Window-Based Scoring

J Chem Inf Model. 2020 Jul 27;60(7):3679-3686. doi: 10.1021/acs.jcim.0c00401. Epub 2020 Jul 1.

Abstract

Signal peptides play an important role in guiding the transport of transmembrane and secreted proteins. With the explosive growth in the number of available protein sequences in recent years, computational prediction of signal peptides and their cleavage sites directly from sequence is highly desirable. In this work, we present an improved approach, Signal-3L 3.0, for signal peptide recognition and cleavage-site prediction, using a three-layer hybrid method that integrates deep learning algorithms with window-based scoring. The Signal-3L 3.0 prediction engine has three main components: (1) a deep bidirectional long short-term memory (Bi-LSTM) network with soft self-attention learns abstract features from sequences to determine whether a query protein contains a signal peptide; (2) a window-based cleavage-site screening method built on statistical propensities generates the set of candidate cleavage sites; and (3) the prediction of a conditional random field combined with a hybrid convolutional neural network (CNN) and Bi-LSTM is fused with the window-based score to identify the final, unique cleavage site. Experimental results on the benchmark datasets show that the new deep learning-driven Signal-3L 3.0 yields promising performance. The online server of Signal-3L 3.0 is available at http://www.csbio.sjtu.edu.cn/bioinf/Signal-3L/.
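
The window-based screening in component (2) can be illustrated with a minimal sketch. The abstract does not give the window size, the propensity formula, or the number of candidates retained, so everything below is an assumption made for illustration: a position-specific log-odds table against a uniform background, a window of 3 residues upstream and 1 downstream of the cleavage site, and the hypothetical helper names `build_propensity` and `candidate_sites`. It is not the Signal-3L 3.0 implementation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_propensity(windows, pseudocount=1.0):
    """Build a per-position log-odds table from aligned cleavage-site windows.

    Assumption: a uniform background (1/20 per residue) and additive
    smoothing; the published method may use a different statistic.
    """
    width = len(windows[0])
    background = 1.0 / len(AMINO_ACIDS)
    denom = len(windows) + pseudocount * len(AMINO_ACIDS)
    table = []
    for pos in range(width):
        counts = Counter(w[pos] for w in windows)
        table.append({
            aa: math.log(((counts[aa] + pseudocount) / denom) / background)
            for aa in AMINO_ACIDS
        })
    return table


def candidate_sites(sequence, table, upstream, top_k=3, max_site=70):
    """Slide the window along the sequence and keep the top_k scoring sites.

    A site value s means cleavage after the first s residues. max_site is
    an illustrative cap on how far into the protein to search.
    """
    downstream = len(table) - upstream
    last = min(len(sequence) - downstream, max_site)
    scored = []
    for site in range(upstream, last + 1):
        window = sequence[site - upstream: site + downstream]
        # Residues outside the 20 standard amino acids contribute 0.
        score = sum(table[i].get(aa, 0.0) for i, aa in enumerate(window))
        scored.append((score, site))
    scored.sort(reverse=True)
    return [(site, score) for score, site in scored[:top_k]]


# Toy training windows (made up): positions -3, -2, -1, +1 around the site.
toy_windows = ["ALAQ", "ASAE", "AVAD", "ALAK"]
toy_table = build_propensity(toy_windows)
toy_query = "MKKLLVLLGSALAQEPT"  # made-up sequence
candidates = candidate_sites(toy_query, toy_table, upstream=3)
```

In this sketch `candidates` is a short ranked list of `(site, score)` pairs. In the pipeline described above, such a candidate set would then be passed to component (3), where the window-based score is fused with the neural prediction to select a single site; how that fusion is done is not specified in the abstract.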

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Deep Learning*
  • Neural Networks, Computer
  • Protein Sorting Signals*

Substances

  • Protein Sorting Signals