PROSPER: an integrated feature-based tool for predicting protease substrate cleavage sites

PLoS One. 2012;7(11):e50300. doi: 10.1371/journal.pone.0050300. Epub 2012 Nov 29.


The ability to catalytically cleave protein substrates after synthesis is fundamental to all forms of life. Accordingly, site-specific proteolysis is one of the most important post-translational modifications. The key to understanding the physiological role of a protease is to identify its natural substrate(s). Knowledge of a protease's substrate specificity can dramatically improve our ability to predict its target protein substrates, but this information must be used effectively if substrates are to be identified efficiently by in silico approaches. To address this problem, we present PROSPER, an integrated feature-based server for in silico identification of protease substrates and their cleavage sites for twenty-four different proteases. PROSPER combines established specificity information for these proteases (derived from the MEROPS database) with a machine learning approach that predicts protease cleavage sites from different but complementary sequence and structure characteristics. Features used by PROSPER include local amino acid sequence profile, predicted secondary structure, solvent accessibility and predicted native disorder. Thus, for proteases with known amino acid specificity, PROSPER provides a convenient, pre-prepared tool for identifying protein substrates of these enzymes. Systematic prediction analysis for the twenty-four proteases included so far revealed that these features strongly improve cleavage site prediction, as evidenced by their contribution to identifying known cleavage sites in substrates of these enzymes. In comparison with two state-of-the-art prediction tools, PoPS and SitePrediction, PROSPER achieves greater accuracy and coverage.
To our knowledge, PROSPER is the first comprehensive server capable of predicting cleavage sites of multiple proteases within a single substrate sequence using machine learning techniques. It is freely available at
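
As an illustration of what a "local amino acid sequence" feature might look like, the following minimal sketch enumerates every candidate peptide bond in a substrate and one-hot encodes the residues flanking it. It is not PROSPER's implementation: the P4-P4' window size, the one-hot encoding, the padding character and the function names `window_features` and `candidate_sites` are all assumptions made for this example. The structural features named in the abstract (predicted secondary structure, solvent accessibility, native disorder) come from separate predictors and are not shown.

```python
# Hypothetical sketch: sliding-window sequence features for candidate
# cleavage sites. Window size and encoding are illustrative assumptions,
# not the published PROSPER feature set.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def window_features(sequence, position, flank=4, pad="X"):
    """Encode the residues around one candidate scissile bond.

    `position` is the 0-based index of the P1 residue, so the candidate
    bond lies between sequence[position] and sequence[position + 1].
    Returns the window string (P{flank}..P1, P1'..P{flank}') and a flat
    one-hot vector of length 2 * flank * 20.
    """
    # Pad both ends so that windows near the termini keep a fixed length.
    padded = pad * flank + sequence + pad * flank
    centre = position + flank  # index of P1 in the padded string
    window = padded[centre - flank + 1 : centre + flank + 1]

    vector = []
    for residue in window:
        # Padding and non-standard residues encode as all zeros.
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return window, vector


def candidate_sites(sequence, flank=4):
    """Yield (position, window, vector) for every peptide bond."""
    for position in range(len(sequence) - 1):
        window, vector = window_features(sequence, position, flank)
        yield position, window, vector


if __name__ == "__main__":
    for position, window, _ in candidate_sites("ACDEFGHIKL"):
        print(position, window)
```

In a setup like this, each vector would be concatenated with the per-residue structural predictions for the same window and passed to a classifier trained per protease on known cleavage sites; that training step is omitted here because the abstract does not specify it.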

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Artificial Intelligence
  • Catalysis
  • Cattle
  • Computational Biology / methods
  • Granzymes / chemistry
  • Humans
  • Hydrolysis
  • Mice
  • Models, Statistical
  • Peptide Hydrolases / chemistry*
  • Peptides / chemistry
  • Protein Binding
  • Protein Conformation
  • Protein Processing, Post-Translational
  • Proteins / chemistry*
  • ROC Curve
  • Software
  • Solvents / chemistry
  • Substrate Specificity

Substances

  • Peptides
  • Proteins
  • Solvents
  • Peptide Hydrolases
  • Granzymes

Grant support

This work was supported by grants from the National Health and Medical Research Council of Australia (NHMRC) (490989), the Australian Research Council (ARC) (LP110200333), the Chinese Academy of Sciences (CAS), the Japan Society for the Promotion of Science (S11156), the Knowledge Innovation Program of CAS (KSCX2-EW-G-8) and Tianjin Municipal Science & Technology Commission (10ZCKFSY05600). JS is an NHMRC Peter Doherty Fellow and a Recipient of the Hundred Talents Program of CAS. AJP is an NHMRC Peter Doherty Fellow. JCW is an ARC Federation Fellow and an honorary NHMRC Principal Research Fellow. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.