A new ensemble coevolution system for detecting HIV-1 protein coevolution

Biol Direct. 2015 Jan 7;10:1. doi: 10.1186/s13062-014-0031-8.


Background: A key challenge in the field of HIV-1 protein evolution is the identification of coevolving amino acids at the molecular level. In the past decades, many sequence-based methods have been designed to detect position-specific coevolution within and between different proteins. However, an ensemble coevolution system that integrates different methods to improve the detection of HIV-1 protein coevolution has not been developed.

Results: We integrated 27 sequence-based prediction methods published between 2004 and 2013 into an ensemble coevolution system. The system allows different sequence-based methods to be combined for coevolution prediction. Using HIV-1 protein structures and experimental data, we evaluated the performance of individual and combined sequence-based methods in predicting HIV-1 intra- and inter-protein coevolution. We showed that sequence-based methods clustered according to their methodology, and that a combination of four methods outperformed any of the 27 individual methods. This four-method combination indicated that HIV-1 intra-protein coevolving positions were mainly located in functional domains and were in physical contact with one another in the protein tertiary structures. In the analysis of HIV-1 inter-protein coevolving positions between Gag and protease, protease drug resistance positions near the active site mostly coevolved with Gag cleavage positions (V128, S373-T375, A431, F448-P453) and Gag C-terminal positions (S489-Q500) under the selective pressure of protease inhibitors.
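
The idea of combining methods and scoring the combination against structural contacts can be illustrated with a minimal Python sketch. The abstract does not say how the ensemble merges scores, so the rank-averaging rule here is an assumption, as are the method names, the toy scores, and the toy contact set. The AUC is computed with the standard Mann-Whitney formulation, in line with the "Area Under Curve" MeSH term, but this is not necessarily the evaluation procedure the authors used.

```python
from itertools import combinations


def rank_normalize(scores):
    """Map raw scores to (0, 1] by rank; tied scores share the average rank."""
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    ranks = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[ordered[k]] = avg_rank / n
        i = j + 1
    return ranks


def ensemble_score(method_scores, chosen):
    """Assumed combination rule: mean of rank-normalized scores per pair."""
    normalized = [rank_normalize(method_scores[m]) for m in chosen]
    return {
        pair: sum(r[pair] for r in normalized) / len(normalized)
        for pair in normalized[0]
    }


def auc(scores, positives):
    """Probability that a true pair outscores a false pair (ties count 0.5)."""
    pos = [s for pair, s in scores.items() if pair in positives]
    neg = [s for pair, s in scores.items() if pair not in positives]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def best_combination(method_scores, positives, size):
    """Exhaustively search for the combination of `size` methods with top AUC."""
    return max(
        combinations(sorted(method_scores), size),
        key=lambda chosen: auc(ensemble_score(method_scores, chosen), positives),
    )


# Hypothetical data: scores for position pairs from three made-up methods.
toy_scores = {
    "method_a": {(1, 2): 0.9, (1, 3): 0.2, (2, 3): 0.8, (2, 4): 0.1},
    "method_b": {(1, 2): 0.3, (1, 3): 0.1, (2, 3): 0.7, (2, 4): 0.6},
    "method_c": {(1, 2): 5.0, (1, 3): 4.0, (2, 3): 1.0, (2, 4): 2.0},
}
# Hypothetical "true" pairs, e.g. positions in contact in a tertiary structure.
toy_contacts = {(1, 2), (2, 3)}

if __name__ == "__main__":
    for name, scores in toy_scores.items():
        print(name, auc(scores, toy_contacts))
    print(best_combination(toy_scores, toy_contacts, size=2))
```

Rank normalization is used in this sketch because raw scores from different methods are generally on different scales; any other rescaling could be substituted without changing the overall structure.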

Conclusions: This study presents a new ensemble coevolution system that detects position-specific coevolution using combinations of 27 different sequence-based methods. Our findings highlight key coevolving residues within HIV-1 structural proteins and between Gag and protease, shedding light on HIV-1 intra- and inter-protein coevolution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Area Under Curve
  • Computational Biology / methods*
  • Databases, Protein
  • Evolution, Molecular*
  • Gene Products, gag / chemistry
  • HIV Protease / genetics*
  • HIV-1 / genetics*
  • Humans
  • Models, Molecular
  • Models, Statistical
  • Protein Binding
  • Protein Structure, Tertiary
  • Reproducibility of Results
  • Viral Proteins / chemistry
  • gag Gene Products, Human Immunodeficiency Virus / genetics*

Substances

  • Gene Products, gag
  • Viral Proteins
  • gag Gene Products, Human Immunodeficiency Virus
  • HIV Protease