Prediction of signal peptides in protein sequences by neural networks

Acta Biochim Pol. 2008;55(2):261-7. Epub 2008 May 26.

Abstract

We present a neural network-based method for detecting signal peptides (SPs) in proteins. The method is trained on sequences of known signal peptides extracted from the Swiss-Prot protein database and can be applied separately to prokaryotic and eukaryotic proteins. A query protein is dissected into short overlapping sequence fragments, and each fragment is scored for the probability that it is a signal peptide and that it contains a cleavage site. The accuracy of the method is comparable to that of existing prediction tools, but it offers significantly higher speed and portability. Cleavage site prediction reaches 73% accuracy on heterogeneous source data containing both prokaryotic and eukaryotic sequences, while discrimination between signal peptides and non-signal peptides exceeds 93% accuracy for any source dataset. As a consequence of its speed and portability, the method can be easily applied to genome-wide datasets. The software can be downloaded freely from http://rpsp.bioinfo.pl/RPSP.tar.gz.
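
The fragment-dissection step described in the abstract can be illustrated with a minimal sketch. This is not the published RPSP implementation: the window length, the step size, the function names, and the hydrophobicity-based placeholder scorer are all assumptions made for illustration. In the actual method, a trained neural network would take the place of the scorer.

```python
from typing import Callable, Iterator, List, Tuple


def overlapping_fragments(
    sequence: str, window: int = 20, step: int = 1
) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, fragment) pairs of overlapping fixed-length windows.

    The window and step defaults are illustrative assumptions,
    not values taken from the paper.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if len(sequence) < window:
        # A sequence shorter than the window is returned as one fragment.
        yield 0, sequence
        return
    for start in range(0, len(sequence) - window + 1, step):
        yield start, sequence[start:start + window]


def hydrophobic_fraction(fragment: str) -> float:
    """Placeholder scorer (assumption): fraction of hydrophobic residues.

    A trained neural network would replace this function in a real predictor.
    """
    if not fragment:
        return 0.0
    return sum(residue in "AILMFVW" for residue in fragment) / len(fragment)


def score_fragments(
    sequence: str,
    scorer: Callable[[str], float],
    window: int = 20,
    step: int = 1,
) -> List[Tuple[int, float]]:
    """Apply a scorer to every fragment; return (start_index, score) pairs."""
    return [
        (start, scorer(fragment))
        for start, fragment in overlapping_fragments(sequence, window, step)
    ]
```

Calling `score_fragments(protein, hydrophobic_fraction)` on a hypothetical sequence string `protein` returns one score per window position. Each fragment is scored independently of the others, so the cost grows linearly with sequence length, which fits the genome-wide use the abstract mentions.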

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Neural Networks, Computer*
  • Protein Sorting Signals / genetics*
  • Proteins / chemistry
  • Proteins / genetics*
  • Sequence Analysis, Protein / methods*
  • Sequence Analysis, Protein / statistics & numerical data
  • Software
  • Software Design

Substances

  • Protein Sorting Signals
  • Proteins