Determining confidence of predicted interactions between HIV-1 and human proteins using conformal method

Pac Symp Biocomput. 2012;311-22.


Identifying protein-protein interactions (PPIs) is critical for understanding virtually all cellular molecular mechanisms. Previously, predicting PPIs was treated as a binary classification task and has commonly been solved in a supervised setting, which requires a positive labeled set of known PPIs and a negative labeled set of non-interacting protein pairs. In those methods, the learner provides the likelihood of the predicted interaction, but without a confidence level associated with each prediction. Here, we apply a conformal prediction (CP) framework to make predictions and estimate the confidence of those predictions. The conformal predictor uses a function measuring the relative 'strangeness' of interacting pairs to check whether a new example, added to the sequence of already known PPIs, would conform to the 'exchangeability' assumption: the distribution of interacting pairs is invariant under any permutation of the pairs. In fact, this is the only assumption we make about the data. Another advantage is that the user can control the number of errors by specifying a desired confidence level. This feature of CP is very useful for ranking a list of candidate interacting pairs. In this paper, the conformal method has been developed to deal with just one class, the class of interacting proteins, because there is no clearly defined class of 'non-interactive' pairs. The confidence level helps the biologist interpret the results and better informs the choice of pairs for experimental validation. We apply the proposed conformal framework to improve the identification of interacting pairs between HIV-1 and human proteins.
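The one-class conformal test described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the strangeness function or the features, so everything concrete here is an assumption made for illustration: each protein pair is represented by a hypothetical numeric feature vector, strangeness is taken to be the mean distance to the k nearest other pairs, and the function names (`knn_strangeness`, `conformal_p_value`, `predict_interacting`) are invented. It is not the authors' implementation.

```python
import math


def knn_strangeness(idx, points, k=2):
    """Assumed nonconformity score: mean Euclidean distance from
    points[idx] to its k nearest neighbours among the other points."""
    dists = sorted(
        math.dist(points[idx], p) for j, p in enumerate(points) if j != idx
    )
    return sum(dists[:k]) / k


def conformal_p_value(known_pairs, candidate, k=2):
    """p-value of `candidate` under exchangeability with `known_pairs`.

    The candidate is appended to the known interacting pairs, every
    example is scored against the augmented set, and the p-value is the
    fraction of examples at least as strange as the candidate
    (the candidate itself is counted)."""
    bag = list(known_pairs) + [candidate]
    scores = [knn_strangeness(i, bag, k) for i in range(len(bag))]
    alpha_new = scores[-1]
    return sum(1 for a in scores if a >= alpha_new) / len(bag)


def predict_interacting(known_pairs, candidate, epsilon=0.2, k=2):
    """Label the candidate as interacting at confidence level 1 - epsilon
    when its p-value exceeds epsilon."""
    return conformal_p_value(known_pairs, candidate, k) > epsilon


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors for known interacting pairs.
    known = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
    for cand in [(0.5, 0.4), (10.0, 10.0)]:
        p = conformal_p_value(known, cand)
        print(cand, round(p, 3), predict_interacting(known, cand))
```

Sorting candidates by p-value yields a ranked list, and under the exchangeability assumption the threshold `epsilon` bounds the rate at which true interacting pairs are wrongly rejected, which is the sense in which the user controls the number of errors.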

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence
  • Computational Biology
  • Databases, Protein / statistics & numerical data
  • HIV-1 / pathogenicity*
  • HIV-1 / physiology*
  • Host-Pathogen Interactions / genetics
  • Host-Pathogen Interactions / physiology*
  • Human Immunodeficiency Virus Proteins / physiology*
  • Humans
  • Models, Statistical
  • Protein Interaction Mapping / statistics & numerical data*
  • RNA, Small Interfering / genetics
  • tat Gene Products, Human Immunodeficiency Virus / physiology


Substances

  • Human Immunodeficiency Virus Proteins
  • RNA, Small Interfering
  • tat Gene Products, Human Immunodeficiency Virus