An improved prediction of chloroplast proteins reveals diversities and commonalities in the chloroplast proteomes of Arabidopsis and rice

Gene. 2004 Mar 31;329:11-6. doi: 10.1016/j.gene.2004.01.008.


Proteins that form part of the chloroplast proteome can be identified by computational prediction of the N-terminal presequences (chloroplast transit peptides, cTPs) of their cytoplasmic precursor proteins. The accuracy of four different cTP predictors was evaluated on a test set of 4500 proteins whose subcellular localization is known, and was found to be substantially lower than previously reported. A combination of cTP prediction programs was superior to any one of the predictors alone. This combination was employed to estimate the size and composition of the chloroplast proteomes of Arabidopsis and rice: about 2100 (Arabidopsis thaliana) and 4800 (Oryza sativa) different chloroplast proteins with a cTP are predicted to be encoded by their nuclear genomes. A subset of around 900 chloroplast proteins, predominantly derived from the cyanobacterial endosymbiont and with functions mostly related to metabolism, energy and transcription, is shared by the two species. This points to the existence in flowering plants of both conserved nucleus-encoded chloroplast proteins that are predominantly of prokaryotic origin and a large fraction of taxon-specific chloroplast-targeted proteins.
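
The idea of combining several cTP predictors and scoring the result against proteins of known localization can be illustrated with a minimal sketch. The abstract does not state how the programs were combined, so the voting rule used here (`min_votes`), the predictor names, and the toy data are all hypothetical. Specificity is taken as TN / (TN + FP), which may differ from the definition used in the study.

```python
from typing import Dict, List, Tuple


def consensus_call(votes: List[bool], min_votes: int) -> bool:
    """Return True if at least min_votes predictors call a cTP (assumed rule)."""
    return sum(votes) >= min_votes


def sensitivity_specificity(predicted: List[bool], actual: List[bool]) -> Tuple[float, float]:
    """Compute sensitivity TP/(TP+FN) and specificity TN/(TN+FP)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# Toy test set: True means the protein is known to be chloroplast-localized.
actual = [True, True, True, False, False, False]

# Hypothetical per-protein calls from three placeholder predictors.
predictor_calls: Dict[str, List[bool]] = {
    "predictor_a": [True, True, False, True, False, False],
    "predictor_b": [True, False, True, False, True, False],
    "predictor_c": [True, True, True, False, False, True],
}

# Combine the calls protein by protein with a two-of-three vote (assumed threshold).
combined = [
    consensus_call([calls[i] for calls in predictor_calls.values()], min_votes=2)
    for i in range(len(actual))
]

for name, calls in predictor_calls.items():
    print(name, sensitivity_specificity(calls, actual))
print("combined", sensitivity_specificity(combined, actual))
```

In this toy data each single predictor makes at least one error, while the vote recovers the known localizations. Real predictors share biases, so a combination will not generally be error-free.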

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics
  • Arabidopsis / metabolism*
  • Chloroplasts / metabolism*
  • Cyanobacteria / genetics
  • Genetic Variation
  • Oryza / genetics
  • Oryza / metabolism*
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Proteome / genetics
  • Proteome / metabolism*
  • Reproducibility of Results
  • Software*
  • Species Specificity


Substances

  • Plant Proteins
  • Proteome