A vector projection approach to predicting HIV protease cleavage sites in proteins

Proteins. 1993 Jun;16(2):195-204. doi: 10.1002/prot.340160206.


A vector projection method is proposed to predict the cleavability of oligopeptides by proteases with extended specificity sites. For an enzyme with eight specificity subsites, the substrate octapeptide can be uniquely expressed as a vector in an 8-dimensional space whose eight basis vectors correspond to the amino acids at the eight subsites P4, P3, P2, P1, P1', P2', P3', and P4'. The component of this characteristic vector along each basis vector is defined as the frequency with which the given amino acid occurs at that subsite. These frequencies were derived from a set of octapeptides known to be cleaved by HIV protease. The cleavability of an octapeptide can then be estimated from the projection of its characteristic vector onto an idealized, optimally cleavable vector. The high ratio of correct predictions to total predictions for both the training and the testing sets indicates that the method is self-consistent and efficient. It provides a rapid and accurate algorithm for analyzing the specificity of any multi-subsite enzyme for which there is no coupling between subsites. In particular, it is useful for predicting the cleavability of an oligopeptide by either HIV-1 or HIV-2 protease, and hence offers a supplementary means of finding effective inhibitors of HIV protease as potential drugs against AIDS.
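
The projection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that the idealized vector takes the highest observed frequency at each subsite and that the score is the projection onto that vector divided by its length. The function names and the toy training peptides are hypothetical and are not real cleavage data.

```python
from collections import Counter

N_SUBSITES = 8  # P4, P3, P2, P1, P1', P2', P3', P4'


def position_frequencies(peptides):
    """Frequency of each residue at each subsite in a set of cleaved octapeptides."""
    if not peptides or any(len(p) != N_SUBSITES for p in peptides):
        raise ValueError("need a non-empty set of 8-residue peptides")
    n = len(peptides)
    freqs = []
    for i in range(N_SUBSITES):
        counts = Counter(p[i] for p in peptides)
        freqs.append({aa: c / n for aa, c in counts.items()})
    return freqs


def characteristic_vector(peptide, freqs):
    """Component i is the training frequency of the peptide's residue at subsite i."""
    if len(peptide) != N_SUBSITES:
        raise ValueError("peptide must have exactly 8 residues")
    return [freqs[i].get(aa, 0.0) for i, aa in enumerate(peptide)]


def ideal_vector(freqs):
    """Assumption: the idealized vector uses the most frequent residue at each subsite."""
    return [max(f.values()) for f in freqs]


def projection_score(peptide, freqs):
    """Projection onto the ideal vector, scaled so the ideal peptide scores 1.0."""
    v = characteristic_vector(peptide, freqs)
    ideal = ideal_vector(freqs)
    dot = sum(a * b for a, b in zip(v, ideal))
    norm_sq = sum(b * b for b in ideal)
    return dot / norm_sq


# Hypothetical toy training set (placeholder sequences, not experimental data).
training = ["ACDEFGHI", "ACDEFGHK", "ACDLFGHI", "MCDEFGHI"]
freqs = position_frequencies(training)
best = projection_score("ACDEFGHI", freqs)    # most frequent residue at every subsite
partial = projection_score("MCDEFGHI", freqs)  # rarer residue at P4
unseen = projection_score("WWWWWWWW", freqs)   # residues never seen in training
```

Because each subsite contributes independently to the dot product, the sketch relies on the same no-coupling assumption the abstract states. Any cutoff that separates cleavable from non-cleavable peptides would have to be calibrated on real data and is deliberately left out here.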

MeSH terms

  • Amino Acid Sequence
  • HIV Protease / metabolism*
  • HIV-1 / enzymology
  • HIV-2 / enzymology
  • Hydrolysis
  • Models, Chemical
  • Molecular Sequence Data
  • Oligopeptides / metabolism
  • Proteins / chemistry
  • Proteins / metabolism*


Substances

  • Oligopeptides
  • Proteins
  • HIV Protease