Computational analysis of HIV-1 protease protein binding pockets

J Chem Inf Model. 2010 Oct 25;50(10):1759-71. doi: 10.1021/ci100200u.

Abstract

Mutations that arise in HIV-1 protease after exposure to HIV-1 protease inhibitors have proved to be a persistent difficulty in the treatment of HIV. Mutations in the binding pocket of the protease can prevent the inhibitor from binding to the protein effectively. In the present study, 68 crystal structures from the Protein Data Bank (PDB) of HIV-1 protease, each complexed with one of the nine FDA-approved protease inhibitors, were analyzed by (a) identifying the mutational changes with the aid of a newly developed mutation map and (b) correlating the structure of the binding pockets with the complexed inhibitors. The mutations in each crystal structure were identified by comparing its amino acid sequence against that of the HIV-1 wild-type strain HXB2. These mutations were presented visually as a mutation map in order to analyze the mutation patterns corresponding to each protease inhibitor. The crystal structure (in vitro) mutation patterns for each inhibitor were compared against the mutation patterns observed in in vivo data, and were found to be representative of most of the major in vivo mutations. We then performed a data mining analysis of the binding pockets from each crystal structure, in terms of their chemical descriptors, to identify structural features of HIV-1 protease that are important for the binding conformation of the inhibitors. The data mining analysis was performed using several classification techniques: Random Forest (RF), linear discriminant analysis (LDA), and logistic regression (LR). We developed two hybrid models, RF-LDA and RF-LR, in which Random Forest was used as a feature selection proxy, reducing the descriptor space to the few descriptors the classifier determined to be most relevant. These descriptors were then used to develop the subsequent LDA, LR, and hierarchical classification models. Clustering analysis of the binding pockets, using the descriptors selected for the optimal classification models, reveals conformational similarities among the ligands in each cluster. This study provides important information for understanding structural features of HIV-1 protease that cannot be studied with existing in vivo genomic data sets.
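As an illustration of the mutation-identification step described in the abstract, the following minimal Python sketch compares each structure's sequence against a reference, position by position, and collects the differences into a simple mutation map. It is not the authors' implementation. The function names (`find_mutations`, `mutation_map`), the toy reference string, and the structure labels are all hypothetical, and the sketch assumes the sequences are already aligned and of equal length; a real analysis would use the HXB2 protease sequence and a proper alignment.

```python
from __future__ import annotations


def find_mutations(reference: str, sample: str) -> list[str]:
    """Return substitutions in `sample` relative to `reference`.

    Each substitution is written as <reference residue><1-based position>
    <sample residue>, e.g. "E4Q". Assumes both sequences are pre-aligned
    and of equal length (an assumption of this sketch).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt
    ]


def mutation_map(reference: str, structures: dict[str, str]) -> dict[str, list[str]]:
    """Map each structure label to its list of substitutions."""
    return {label: find_mutations(reference, seq) for label, seq in structures.items()}


# Toy data only: NOT the HXB2 sequence and NOT real PDB entries.
TOY_REFERENCE = "ACDEFGHIKL"
TOY_STRUCTURES = {
    "structure_A": "ACDEFGHIKL",  # identical to the toy reference
    "structure_B": "ACDQFGHIKV",  # two substitutions
}

if __name__ == "__main__":
    for label, mutations in mutation_map(TOY_REFERENCE, TOY_STRUCTURES).items():
        print(label, mutations)
```

Grouping the resulting lists by the inhibitor bound in each structure would give the per-inhibitor mutation patterns that the abstract compares against in vivo data.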

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Sequence
  • Binding Sites
  • Computer Simulation
  • Crystallography, X-Ray
  • Data Mining
  • HIV Infections / drug therapy
  • HIV Infections / enzymology
  • HIV Infections / genetics
  • HIV Protease / chemistry*
  • HIV Protease / genetics*
  • HIV Protease / metabolism
  • HIV Protease Inhibitors / chemistry
  • HIV Protease Inhibitors / pharmacology*
  • HIV-1 / chemistry
  • HIV-1 / enzymology*
  • HIV-1 / genetics
  • Humans
  • Models, Molecular
  • Molecular Sequence Data
  • Mutation*
  • Protein Binding
  • Protein Conformation

Substances

  • HIV Protease Inhibitors
  • HIV Protease
  • p16 protease, Human immunodeficiency virus 1