PPero, a Computational Model for Plant PTS1 Type Peroxisomal Protein Prediction

PLoS One. 2017 Jan 3;12(1):e0168912. doi: 10.1371/journal.pone.0168912. eCollection 2017.

Abstract

Well-defined motifs often make it easier to investigate protein function and localization. In plants, peroxisomal proteins are guided to peroxisomes mainly by a conserved type 1 (PTS1) or type 2 (PTS2) targeting signal, and the PTS1 motif is commonly used to predict peroxisome-targeted proteins. Currently, computational prediction of peroxisome-targeted PTS1-type proteins is mostly based on the 3-amino-acid PTS1 motif and an adjacent sequence of fewer than 14 amino acid residues. The potential contribution of adjacent sequences beyond this short region has not been well investigated in plants. In this work, we develop a bi-profile Bayesian SVM method to extract and learn position-based amino acid features for both PTS1 motifs and their extended adjacent sequences in plants. Our proposed model outperformed other implementations with similar applications and achieved the highest accuracy, 93.6% for Arabidopsis and 92.6% for other plant species. A large-scale analysis of the Arabidopsis, rice, maize, potato, wheat, and soybean proteomes was conducted using the proposed model, and a batch of candidate PTS1 proteins was predicted. The DNA segments corresponding to the C-terminal sequences of 9 selected candidates were cloned and transformed into Arabidopsis for experimental validation, and 5 of them demonstrated peroxisome targeting.
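The abstract does not spell out how the position-based features are computed, so the following is only a minimal sketch of one common reading of bi-profile encoding, not the authors' implementation. It assumes equal-length C-terminal windows, the 20 standard amino acids, and add-one smoothing; the function names, the pseudocount, and the toy sequences are all hypothetical. Each residue is encoded by its position-specific frequency in the positive training set and in the negative training set, and the two profiles are concatenated into a vector that could then be passed to an SVM.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: 20 standard residues only


def position_profile(seqs, pseudocount=1.0):
    """Per-position amino acid frequencies for equal-length sequences.

    The pseudocount (an assumption of this sketch) avoids zero
    probabilities for residues never seen at a position.
    """
    length = len(seqs[0])
    total = len(seqs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        profile.append(
            {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}
        )
    return profile


def bi_profile_features(seq, pos_profile, neg_profile):
    """Encode a sequence as [positive-profile values] + [negative-profile values]."""
    pos = [pos_profile[i][a] for i, a in enumerate(seq)]
    neg = [neg_profile[i][a] for i, a in enumerate(seq)]
    return pos + neg


# Hypothetical toy data: 5-residue C-terminal windows, not real proteins.
positives = ["AASKL", "GHSKL", "PTSRL"]
negatives = ["AAGGG", "KLMNP", "SKLAA"]

pos_profile = position_profile(positives)
neg_profile = position_profile(negatives)
features = bi_profile_features("AASKL", pos_profile, neg_profile)
```

For a window of length n the vector has 2n entries. In the toy data the final residue L is frequent in the positive set and absent from the negative set, so its positive-profile value exceeds its negative-profile value, which is the kind of contrast a downstream classifier can exploit.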

MeSH terms

  • Algorithms
  • Amino Acid Motifs
  • Amino Acids / metabolism
  • Arabidopsis / genetics*
  • Arabidopsis Proteins / genetics
  • Bayes Theorem
  • Computational Biology / methods
  • Computer Simulation*
  • Genome, Plant
  • Microscopy, Confocal
  • Oryza / genetics
  • Peroxisome-Targeting Signal 1 Receptor
  • Peroxisomes / metabolism*
  • Plant Proteins / genetics*
  • Probability
  • Protein Sorting Signals / genetics
  • Proteome
  • Receptors, Cytoplasmic and Nuclear / metabolism*
  • Solanum tuberosum / genetics
  • Soybeans / genetics
  • Triticum / genetics
  • Zea mays / genetics

Substances

  • Amino Acids
  • Arabidopsis Proteins
  • Peroxisome-Targeting Signal 1 Receptor
  • Plant Proteins
  • Protein Sorting Signals
  • Proteome
  • Receptors, Cytoplasmic and Nuclear

Grant support

This work was supported by a grant from the Shenzhen Science and Technology Committee (grant no. JCYJ20140425184428456), and partially by a grant from the Hong Kong Research Grants Council (project no. CUHK3/CRF/11G). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.