Motivation: In vitro studies have shown that the most remarkable catalytic features of caspases, a family of cysteine proteases, are their stringent specificity for Asp (D) in the S1 subsite and their requirement for at least four amino acids to the left of the scissile bond. However, little is known about substrate recognition patterns in vivo. Predicting and characterizing proteolytic cleavage sites in natural substrates could help uncover these structural relationships.
Results: PEST-like sequences, rich in the amino acids Ser (S), Thr (T), Pro (P), and Glu or Asp (E/D) and including Asn (N) and Gln (Q), are adjacent structural/sequential elements in the majority of cleavage site regions of the natural caspase substrates described in the literature, which supports their possible involvement in substrate selection by caspases. We developed CaSPredictor, a software tool that incorporates a PEST-like index and position-dependent amino acid matrices to predict caspase cleavage sites in individual proteins and protein datasets. The program correctly predicted 81% (111/137) of the cleavage sites in experimentally verified caspase substrates not annotated in its internal data file. Its accuracy and confidence were estimated as 80% using ROC methodology. The program was much more efficient at predicting caspase substrates than the PeptideCutter and PEPS software. Finally, the program detected potential cleavage sites in the primary sequences of 1644 proteins in a dataset containing 9986 protein entries.
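The general idea of combining a position-dependent matrix with a PEST-like index can be illustrated with a minimal Python sketch. This is not the CaSPredictor implementation: the matrix values, the flanking window size, the weighting, the threshold, and all function names below are hypothetical placeholders chosen only to show how the two signals might be combined.

```python
# Illustrative sketch only. All scores, weights, window sizes and thresholds
# are hypothetical placeholders, not values taken from CaSPredictor.

# Residues counted as PEST-like in the abstract: S, T, P, E/D, plus N and Q.
PEST_LIKE = set("STPEDNQ")

# Toy position-dependent scores for P4..P1 (index 0 = P4, index 3 = P1).
TOY_MATRIX = [
    {"D": 1.0, "L": 0.6, "I": 0.5, "V": 0.5},  # P4
    {"E": 1.0, "Q": 0.4},                      # P3
    {"V": 0.8, "H": 0.6, "T": 0.5},            # P2
    {"D": 1.0},                                # P1 (Asp is required)
]


def pest_fraction(seq, center, flank=10):
    """Fraction of PEST-like residues in a window around index `center`."""
    start = max(0, center - flank)
    window = seq[start:center + flank + 1]
    return sum(aa in PEST_LIKE for aa in window) / len(window)


def matrix_score(tetrapeptide):
    """Mean toy matrix score over the P4-P1 residues."""
    total = sum(TOY_MATRIX[i].get(aa, 0.0) for i, aa in enumerate(tetrapeptide))
    return total / len(TOY_MATRIX)


def predict_sites(seq, threshold=0.6, pest_weight=0.3):
    """Return (1-based P1 position, P4-P1 tetrapeptide, score) for candidates."""
    sites = []
    for i in range(3, len(seq)):
        if seq[i] != "D":  # only Asp is considered at P1
            continue
        tetra = seq[i - 3:i + 1]
        score = ((1 - pest_weight) * matrix_score(tetra)
                 + pest_weight * pest_fraction(seq, i))
        if score >= threshold:
            sites.append((i + 1, tetra, round(score, 3)))
    return sites


if __name__ == "__main__":
    # Made-up sequence containing a DEVD motif in a PEST-like context.
    for position, motif, score in predict_sites("MKTAGSPSTDEVDGSPTEELLKK"):
        print(position, motif, score)
```

With these toy settings, the made-up sequence yields a single candidate, the DEVD motif with Asp at position 13; the Asp at position 10 (motif PSTD) falls below the threshold because its matrix score is low despite the PEST-like surroundings.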
Availability: Requests for the software should be made to Dr José E. Belizário.
Supplementary information: Supplementary information is available for academic users at http://icb.usp.br/~farmaco/Jose/CaSpredictorfiles.