Reconstructing the Dynamics of HIV Evolution Within Hosts From Serial Deep Sequence Data

PLoS Comput Biol. 2012;8(11):e1002753. doi: 10.1371/journal.pcbi.1002753. Epub 2012 Nov 1.


In the early stage of infection, human immunodeficiency virus (HIV)-1 predominantly uses the CCR5 coreceptor for host cell entry. The subsequent emergence of HIV variants that use the CXCR4 coreceptor, which occurs in roughly half of all infections, is associated with an accelerated decline of CD4+ T-cells and faster progression to AIDS. The presence of a 'fitness valley' separating CCR5- and CXCR4-using genotypes is postulated to be a biological determinant of whether the HIV coreceptor switch occurs. Using phylogenetic methods to reconstruct the evolutionary dynamics of HIV within hosts enables us to discriminate between competing models of this process. We have developed a phylogenetic pipeline for the molecular clock analysis, ancestral reconstruction, and visualization of deep sequence data. These data were generated by next-generation sequencing of HIV RNA extracted from longitudinal serum samples (median 7 time points) from 8 untreated subjects with chronic HIV infections (Amsterdam Cohort Studies on HIV-1 infection and AIDS). We used the known dates of sampling to directly estimate rates of evolution and to map ancestral mutations to a reconstructed timeline in units of days. HIV coreceptor usage was predicted from reconstructed ancestral sequences using the geno2pheno algorithm. We determined that the first mutations contributing to CXCR4 use emerged about 16 months (per-subject range 4 to 30) before the earliest predicted CXCR4-using ancestor, which in turn preceded the first positive cell-based assay of CXCR4 usage by 10 months (range 5 to 25). CXCR4 usage arose in multiple lineages within 5 of 8 subjects, and ancestral lineages that followed alternate mutational pathways before going extinct were common. We observed highly patient-specific distributions and time-scales of mutation accumulation, implying that the role of a fitness valley is contingent on the genotype of the transmitted variant.
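
The idea of using known sampling dates to estimate an evolutionary rate can be illustrated with a root-to-tip regression: genetic divergence from the root is regressed on sampling day, the slope approximates the rate, and the x-intercept approximates the date of the root. This is only a minimal sketch of that general technique, not the pipeline described above, whose molecular clock method is not specified in the abstract. The function name `root_to_tip_rate` and all numeric values are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def root_to_tip_rate(days, divergences):
    """Ordinary least-squares fit of divergence against sampling day.

    Returns (rate, root_day): the slope in substitutions per site per day
    and the x-intercept, i.e. the day on which fitted divergence is zero.
    root_day is None when the fitted slope is exactly zero.
    """
    if len(days) != len(divergences) or len(days) < 2:
        raise ValueError("need at least two paired observations")
    mean_day = mean(days)
    mean_div = mean(divergences)
    sxx = sum((d - mean_day) ** 2 for d in days)
    if sxx == 0:
        raise ValueError("sampling days must not all be identical")
    sxy = sum((d - mean_day) * (v - mean_div)
              for d, v in zip(days, divergences))
    rate = sxy / sxx
    intercept = mean_div - rate * mean_day
    root_day = -intercept / rate if rate != 0 else None
    return rate, root_day


# Invented example values (NOT data from the study): sampling days relative
# to the first sample, and root-to-tip divergence in substitutions per site.
days = [0, 200, 400, 600, 800]
divergences = [0.010, 0.014, 0.018, 0.022, 0.026]

rate, root_day = root_to_tip_rate(days, divergences)
print(f"rate per site per day: {rate:.2e}")
print(f"estimated root day: {root_day:.0f}")
```

A regression of this kind treats the tips as independent observations, which they are not, so it is normally used as a quick check for temporal signal rather than as a final rate estimate.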

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computational Biology / methods*
  • Evolution, Molecular*
  • Genetic Fitness / genetics
  • Genotype
  • HIV Infections / virology*
  • HIV-1 / genetics*
  • HIV-1 / pathogenicity
  • High-Throughput Nucleotide Sequencing*
  • Host-Pathogen Interactions / genetics*
  • Humans
  • Molecular Sequence Data
  • Mutation
  • Phenotype
  • Phylogeny
  • RNA, Viral / chemistry
  • RNA, Viral / genetics
  • Receptors, CCR5
  • Receptors, CXCR4

Substances

  • CXCR4 protein, human
  • RNA, Viral
  • Receptors, CCR5
  • Receptors, CXCR4