PredSL: a tool for the N-terminal sequence-based prediction of protein subcellular localization

Genomics Proteomics Bioinformatics. 2006 Feb;4(1):48-55. doi: 10.1016/S1672-0229(06)60016-8.

Abstract

The ability to predict the subcellular localization of a protein from its sequence is of great importance, as it provides information about the protein's function. We present a computational tool, PredSL, which uses neural networks, Markov chains, profile hidden Markov models, and scoring matrices to predict the subcellular localization of proteins in eukaryotic cells from the N-terminal amino acid sequence. It aims to classify proteins into five groups: chloroplast, thylakoid, mitochondrion, secretory pathway, and "other". When tested in a five-fold cross-validation procedure, PredSL demonstrates 86.7% and 87.1% overall accuracy for the plant and non-plant datasets, respectively. In most cases, the results of PredSL are comparable to those of TargetP, which is the most widely used method to date, and LumenP. When tested on the experimentally verified proteins of the Saccharomyces cerevisiae genome, PredSL performs comparably to, if not better than, any available algorithm for the same task. Furthermore, PredSL is the only method capable of predicting these subcellular localizations that is available as a stand-alone application, through the URL: http://bioinformatics.biol.uoa.gr/PredSL/.
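As an illustration of the evaluation reported above, the following is a minimal Python sketch of a five-fold cross-validation that yields a single overall accuracy figure. It is not the PredSL implementation. The abstract does not say how the folds were built or exactly how overall accuracy was defined, so the sketch assumes round-robin folds and takes overall accuracy to be the fraction of held-out proteins assigned the correct class. All function names (`k_fold_indices`, `cross_validate`, `majority_trainer`) are hypothetical, and the placeholder model simply predicts the most frequent training label.

```python
from typing import Callable, List, Sequence

# The five classes named in the abstract.
LABELS = ("chloroplast", "thylakoid", "mitochondrion", "secretory pathway", "other")

Predictor = Callable[[str], str]
Trainer = Callable[[List[str], List[str]], Predictor]


def k_fold_indices(n: int, k: int = 5) -> List[List[int]]:
    """Split indices 0..n-1 into k folds by round-robin assignment (an assumption)."""
    return [list(range(i, n, k)) for i in range(k)]


def overall_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cross_validate(samples: Sequence[str], labels: Sequence[str],
                   train: Trainer, k: int = 5) -> float:
    """Train on k-1 folds, predict the held-out fold, and pool all predictions."""
    y_true: List[str] = []
    y_pred: List[str] = []
    for held_out in k_fold_indices(len(samples), k):
        held = set(held_out)
        train_x = [s for i, s in enumerate(samples) if i not in held]
        train_y = [l for i, l in enumerate(labels) if i not in held]
        predict = train(train_x, train_y)
        for i in held_out:
            y_true.append(labels[i])
            y_pred.append(predict(samples[i]))
    return overall_accuracy(y_true, y_pred)


def majority_trainer(train_x: List[str], train_y: List[str]) -> Predictor:
    """Placeholder model: always predicts the most common training label."""
    most_common = max(sorted(set(train_y)), key=train_y.count)
    return lambda sequence: most_common


if __name__ == "__main__":
    # Toy data only: dummy N-terminal sequences with made-up labels.
    toy_sequences = ["MKT" + "A" * i for i in range(10)]
    toy_labels = ["other"] * 7 + ["secretory pathway"] * 3
    print(cross_validate(toy_sequences, toy_labels, majority_trainer))
```

Any real classifier can be substituted for `majority_trainer`, provided it follows the same train-then-predict interface.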

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Databases, Protein
  • Organelles / metabolism*
  • Peptide Fragments / metabolism*
  • Protein Sorting Signals* / physiology
  • Proteins / metabolism*
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Peptide Fragments
  • Protein Sorting Signals
  • Proteins