CropPAL for discovering divergence in protein subcellular location in crops to support strategies for molecular crop breeding

Plant J. 2020 Nov;104(3):812-827. doi: 10.1111/tpj.14961. Epub 2020 Sep 16.


Agriculture faces increasing demand for yield, higher plant-derived protein content and diversity while facing pressure to achieve sustainability. Although the genomes of many of the important crops have been sequenced, the subcellular locations of most of the encoded proteins remain unknown or are only predicted. Protein subcellular location is crucial in determining protein function and accumulation patterns in plants, and is critical for targeted improvements in yield and resilience. Integrating location data from over 800 studies for 12 major crop species into the cropPAL2020 data collection showed that while >80% of proteins in most species are not localised by experimental data, combining species data or integrating predictions can help bridge gaps at similar accuracy. The collation and integration of over 61 505 experimental localisations and more than 6 million predictions showed that the relative sizes of the protein catalogues located in different subcellular compartments are comparable between crops and Arabidopsis. A comprehensive cross-species comparison showed that between 50% and 80% of the subcellulomes are conserved across species and that conservation only depends to some degree on the phylogenetic relationship of the species. Protein subcellular locations in major biosynthesis pathways are more often conserved than in metabolic pathways. Underlying this conservation is a clear potential for subcellular diversity in protein location between species by means of gene duplication and alternative splicing. Our cropPAL data set and search platform provide a comprehensive subcellular proteomics resource to drive compartmentation-based approaches for improving yield, protein composition and resilience in future crop varieties.

Keywords: Brassica napus; Brassica rapa; Glycine max; Hordeum vulgare; Musa acuminata; Oryza sativa; Solanum lycopersicum; Solanum tuberosum; Sorghum bicolor; Triticum aestivum; Vitis vinifera; Zea mays; Proteomics; crops; subcellular localisation; systems biology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Compartmentation
  • Crops, Agricultural / cytology
  • Crops, Agricultural / metabolism*
  • Databases, Protein*
  • Plant Breeding
  • Plant Cells / metabolism
  • Plant Proteins / metabolism*
  • Species Specificity


Substances

  • Plant Proteins