Predotar: A tool for rapidly screening proteomes for N-terminal targeting sequences

Proteomics. 2004 Jun;4(6):1581-90. doi: 10.1002/pmic.200300776.


Probably more than 25% of the proteins encoded by the nuclear genomes of multicellular eukaryotes are targeted to membrane-bound compartments by N-terminal targeting signals. The major signals are those for the endoplasmic reticulum, the mitochondria and, in plants, the plastids. The most abundant of these targeted proteins are well known and well studied, but a large proportion remain unknown, including most of those involved in the regulation of organellar gene expression or of biochemical pathways. The discovery and characterization of these proteins by biochemical means will be long and difficult. An alternative is to identify candidate organellar proteins via their characteristic N-terminal targeting sequences. We have developed a neural network-based approach (Predotar: Prediction of Organelle Targeting sequences) for identifying genes encoding these proteins among eukaryotic genome sequences. The power of this approach for identifying and annotating novel gene families is illustrated by the discovery of the pentatricopeptide repeat family.
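The general idea of screening a proteome by N-terminal sequence can be sketched as follows. This is a minimal illustration only, not Predotar's actual architecture, encoding, or training procedure, none of which are given in the abstract. The window length, the class labels, the network size, and every function name are hypothetical, and the weights are random placeholders where a real predictor would use weights learned from proteins of known localization.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
WINDOW = 40  # hypothetical N-terminal window length, not taken from the paper
CLASSES = ("mitochondrial", "plastid", "ER", "none")  # hypothetical output labels


def encode_n_terminus(sequence, window=WINDOW):
    """One-hot encode the first `window` residues; short sequences are zero-padded."""
    vector = []
    for i in range(window):
        residue = sequence[i].upper() if i < len(sequence) else None
        # Non-standard residues (e.g. X) and padding encode as all zeros.
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector


def make_random_network(n_inputs, n_hidden, n_outputs, seed=0):
    """Build placeholder weights; a real predictor would learn these from labeled data."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_outputs)]
    return w1, w2


def forward(network, inputs):
    """Single hidden layer with tanh units, then a softmax over the output classes."""
    w1, w2 = network
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in w1]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def screen(proteome, network):
    """Return {protein_id: (best_class, probability)} for every sequence."""
    results = {}
    for protein_id, sequence in proteome.items():
        probs = forward(network, encode_n_terminus(sequence))
        best = max(range(len(CLASSES)), key=probs.__getitem__)
        results[protein_id] = (CLASSES[best], probs[best])
    return results
```

With untrained weights the predicted classes carry no biological meaning; the sketch only shows the shape of the pipeline: take the N-terminal window of each protein, encode it numerically, and score it against a small set of targeting classes.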

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis Proteins / genetics*
  • Arabidopsis Proteins / metabolism
  • Computational Biology
  • Computer Simulation
  • Gene Targeting
  • Humans
  • Neural Networks, Computer
  • Organelles / metabolism
  • Proteome*
  • Saccharomyces cerevisiae Proteins / genetics*
  • Saccharomyces cerevisiae Proteins / metabolism
  • Sensitivity and Specificity

Substances

  • Arabidopsis Proteins
  • Proteome
  • Saccharomyces cerevisiae Proteins