Proteome-derived, database-searchable peptide libraries for identifying protease cleavage sites

Nat Biotechnol. 2008 Jun;26(6):685-94. doi: 10.1038/nbt1408. Epub 2008 May 25.


We introduce human proteome-derived, database-searchable peptide libraries for characterizing sequence-specific protein interactions. To identify endoprotease cleavage sites, we used peptides in such libraries with protected primary amines to simultaneously determine sequence preferences on the N-terminal (nonprime P) and C-terminal (prime P') sides of the scissile bond. Prime-side cleavage products were tagged with biotin, isolated and identified by tandem mass spectrometry, and the corresponding nonprime-side sequences were derived from human proteome databases using bioinformatics. Identification of hundreds to over 1,000 individual cleaved peptides allows the consensus protease cleavage site and subsite cooperativity to be readily determined from P6 to P6'. For the highly specific GluC protease, >95% of the 558 cleavage sites identified displayed the canonical selectivity. For the broad-specificity matrix metalloproteinase 2, >1,200 peptidic cleavage sites were identified. Profiling of HIV protease 1, caspase 3, caspase 7, cathepsins K and G, elastase and thrombin showed that this approach is broadly applicable to all mechanistic classes of endoproteases.
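The database step described above, in which a prime-side peptide identified by mass spectrometry is mapped back to the proteome to recover the nonprime-side residues, can be sketched as follows. This is only a minimal illustration, not the authors' pipeline, which the abstract does not detail. The function names (`nonprime_side`, `cleavage_windows`, `position_counts`), the toy protein sequences, and the rule of discarding peptides that map to more than one site are all assumptions made for the example. It also ignores the boundaries of the library peptides themselves.

```python
from collections import Counter

def nonprime_side(prime_peptide, proteome, width=6):
    """Find each occurrence of a prime-side peptide and return the
    residues immediately N-terminal to it (up to `width` residues)."""
    hits = []
    for name, seq in proteome.items():
        start = seq.find(prime_peptide)
        while start != -1:
            if start > 0:  # a match at position 0 has no nonprime side
                hits.append((name, seq[max(0, start - width):start]))
            start = seq.find(prime_peptide, start + 1)
    return hits

def cleavage_windows(prime_peptides, proteome, width=6):
    """Build P6..P1 + P1'..P6' windows, keeping only peptides that map
    to exactly one site with a full-length window (an assumed rule)."""
    windows = []
    for pep in prime_peptides:
        hits = nonprime_side(pep, proteome, width)
        if len(hits) == 1 and len(hits[0][1]) == width and len(pep) >= width:
            windows.append(hits[0][1] + pep[:width])
    return windows

def position_counts(windows, width=6):
    """Tally residue occurrences at each subsite from P6 to P6'."""
    labels = [f"P{i}" for i in range(width, 0, -1)]
    labels += [f"P{i}'" for i in range(1, width + 1)]
    return {label: Counter(w[i] for w in windows)
            for i, label in enumerate(labels)}

# Invented toy sequences, not real proteins.
toy_proteome = {
    "protA": "MKTAYLAEGSWQPLVR",
    "protB": "GHTNDVLEAFKMCSTR",
}
toy_prime_peptides = ["GSWQPLVR", "AFKMCSTR"]

windows = cleavage_windows(toy_prime_peptides, toy_proteome)
counts = position_counts(windows)
print(windows)
print(counts["P1"])
```

In this toy input both windows carry glutamate at P1, the kind of pattern a GluC-like specificity would produce. With real data the per-position tallies would normally be compared against background amino acid frequencies before a consensus is read off.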

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Binding Sites
  • Databases, Protein*
  • Drug Delivery Systems / methods*
  • Information Storage and Retrieval / methods
  • Molecular Sequence Data
  • Peptide Hydrolases / chemistry*
  • Peptide Library*
  • Protein Binding
  • Protein Interaction Mapping / methods*
  • Proteome / chemistry*
  • Sequence Analysis, Protein / methods

Substances

  • Peptide Library
  • Proteome
  • Peptide Hydrolases