Predicting subcellular localization of proteins based on their N-terminal amino acid sequence

J Mol Biol. 2000 Jul 21;300(4):1005-16. doi: 10.1006/jmbi.2000.3903.

Abstract

A neural network-based tool, TargetP, for large-scale subcellular location prediction of newly identified proteins has been developed. Using N-terminal sequence information only, it discriminates between proteins destined for the mitochondrion, the chloroplast, the secretory pathway, and "other" localizations with a success rate of 85% (plant) or 90% (non-plant) on redundancy-reduced test sets. From a TargetP analysis of the recently sequenced Arabidopsis thaliana chromosomes 2 and 4 and the Ensembl Homo sapiens protein set, we estimate that 10% of all plant proteins are mitochondrial and 14% chloroplastic, and that the abundance of secretory proteins, in both Arabidopsis and Homo, is around 10%. TargetP also predicts cleavage sites with levels of correctly predicted sites ranging from approximately 40% to 50% (chloroplastic and mitochondrial presequences) to above 70% (secretory signal peptides). TargetP is available as a web-server at http://www.cbs.dtu.dk/services/TargetP/.
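The decision step described in the abstract can be illustrated with a minimal sketch: given one score per candidate location, assign the protein to the highest-scoring class, then measure the fraction of correct assignments on a labeled test set. This is not the published TargetP implementation. The function names, the class labels, the default window length, and the example scores are all hypothetical, and the neural networks that would produce the scores are not shown.

```python
from typing import Dict, List, Sequence

# Hypothetical class labels; the non-plant case has no chloroplast class.
PLANT_CLASSES = ("chloroplast", "mitochondrion", "secretory", "other")
NON_PLANT_CLASSES = ("mitochondrion", "secretory", "other")


def n_terminal(sequence: str, length: int = 100) -> str:
    """Return the first `length` residues, upper-cased.

    The default length is an illustrative assumption, not a value
    taken from the abstract.
    """
    return sequence[:length].upper()


def predict_location(scores: Dict[str, float],
                     classes: Sequence[str] = PLANT_CLASSES) -> str:
    """Pick the class with the highest score (simple winner-takes-all)."""
    missing = [c for c in classes if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    return max(classes, key=lambda c: scores[c])


def success_rate(predicted: List[str], actual: List[str]) -> float:
    """Fraction of proteins whose predicted class matches the known class."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up scores for a single protein, for illustration only.
example_scores = {
    "chloroplast": 0.10,
    "mitochondrion": 0.70,
    "secretory": 0.15,
    "other": 0.05,
}
example_prediction = predict_location(example_scores)
```

For a non-plant protein, the same function can be called with `classes=NON_PLANT_CLASSES` so that the chloroplast class is never considered.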

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Arabidopsis*
  • Biological Transport
  • Chloroplasts / chemistry
  • Chloroplasts / metabolism
  • Cytoplasm / chemistry
  • Cytoplasm / metabolism
  • Databases, Factual
  • Humans
  • Internet
  • Mitochondria / chemistry
  • Mitochondria / metabolism
  • Molecular Sequence Data
  • Neural Networks, Computer
  • Nuclear Proteins / chemistry
  • Nuclear Proteins / genetics
  • Nuclear Proteins / metabolism
  • Plant Proteins / chemistry
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Protein Processing, Post-Translational
  • Protein Sorting Signals / chemistry
  • Protein Sorting Signals / physiology*
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Software

Substances

  • Nuclear Proteins
  • Plant Proteins
  • Protein Sorting Signals
  • Proteins