Deep Sequencing of the HIV-1 env Gene Reveals Discrete X4 Lineages and Linkage Disequilibrium between X4 and R5 Viruses in the V1/V2 and V3 Variable Regions

J Virol. 2016 Jul 27;90(16):7142-58. doi: 10.1128/JVI.00441-16. Print 2016 Aug 15.


HIV-1 requires the CD4 receptor and a coreceptor (CCR5 [R5 phenotype] or CXCR4 [X4 phenotype]) to enter cells. Coreceptor tropism can be assessed by either phenotypic or genotypic analysis, the latter using bioinformatics algorithms to predict tropism based on the env V3 sequence. We used the Primer ID sequencing strategy with the MiSeq sequencing platform to reveal the structure of viral populations in the V1/V2 and C2/V3 regions of the HIV-1 env gene in 30 late-stage and 6 early-stage subjects. We also used endpoint dilution PCR followed by cloning of env genes to create pseudotyped virus to explore the link between genotypic predictions and phenotypic assessment of coreceptor usage. We found that the most stringent sequence-based calls of X4 variants (Geno2Pheno false-positive rate [FPR] of ≤2%) formed distinct lineages within the viral population, and these were detected in 24 of 30 late-stage samples (80%), a frequency significantly higher than previously reported with other approaches. Non-X4 lineages were not skewed toward lower FPR scores in X4-containing populations. Phenotypic assays showed that variants with an intermediate FPR (2 to 20%) could be either X4/dual-tropic or R5 variants, although the X4 variants made up only about 25% of the lineages with an FPR of <10%, and these variants carried a distinctive sequence change. Phylogenetic analysis of both the V1/V2 and C2/V3 regions showed evidence of recombination within but very little recombination between the X4 and R5 lineages, suggesting that these populations are genetically isolated.
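
The FPR cutoffs quoted above (≤2% for stringent X4 calls, 2 to 20% for intermediate values) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the label for values above 20%, the handling of a value of exactly 20%, and the example FPR values are all assumptions made for illustration; only the cutoffs themselves come from the abstract.

```python
# Illustrative only: bins Geno2Pheno FPR values (in percent) using the
# cutoffs quoted in the abstract. Names and example data are hypothetical.

X4_MAX_FPR = 2.0             # abstract: stringent X4 call at FPR <= 2%
INTERMEDIATE_MAX_FPR = 20.0  # abstract: intermediate FPR is 2 to 20%


def fpr_bin(fpr_percent: float) -> str:
    """Return a coarse bin label for one FPR value given in percent."""
    if not 0.0 <= fpr_percent <= 100.0:
        raise ValueError("FPR must be a percentage between 0 and 100")
    if fpr_percent <= X4_MAX_FPR:
        return "X4"
    # Assumption: a value of exactly 20% is counted as intermediate.
    if fpr_percent <= INTERMEDIATE_MAX_FPR:
        return "intermediate"
    # Assumption: the abstract does not name this bin; label is ours.
    return "high_fpr"


def sample_has_x4(fprs: list[float]) -> bool:
    """True if at least one variant in the sample is a stringent X4 call."""
    return any(fpr_bin(f) == "X4" for f in fprs)


def x4_detection_rate(samples: dict[str, list[float]]) -> float:
    """Fraction of samples that contain at least one stringent X4 call."""
    if not samples:
        raise ValueError("no samples given")
    positives = sum(sample_has_x4(fprs) for fprs in samples.values())
    return positives / len(samples)


# Hypothetical FPR values for three made-up samples.
example_samples = {
    "sample_A": [0.8, 35.0, 61.2],
    "sample_B": [12.5, 48.0],
    "sample_C": [1.9, 2.0, 90.0],
}
rate = x4_detection_rate(example_samples)
```

Applied to the counts reported in the abstract, the same ratio is 24 / 30 = 0.80. Intermediate values are kept as a separate bin rather than forced into X4 or R5 because the phenotypic assays showed that such variants can be either.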

Importance: Primer ID sequencing provides a novel approach to study genetic structures of viral populations. X4 variants may be more prevalent than previously reported when assessed by using next-generation sequencing (NGS) and with a greater depth of sampling than single-genome amplification (SGA). Phylogenetic analysis to identify lineages of sequences with intermediate FPR values may provide additional information for accurately predicting X4 variants by using V3 sequences. Limited recombination occurs between X4 and R5 lineages, suggesting that X4 and R5 variants are genetically isolated and may be replicating in different cell types or that X4/R5 recombinants have reduced fitness.
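
As a rough illustration of the general idea behind Primer ID sequencing, the sketch below collapses reads that share the same random primer tag into one consensus sequence per tagged template. It is a simplified sketch under stated assumptions, not the authors' method: reads are assumed to be already trimmed to equal length, the `min_reads` threshold of 3 is hypothetical, and the function name and example reads are invented.

```python
from collections import Counter, defaultdict

# Illustrative only: per-template consensus from (primer_id, read) pairs.
# The threshold, names, and data are assumptions, not values from the paper.


def consensus_by_primer_id(tagged_reads, min_reads=3):
    """Group reads by primer ID and return {primer_id: consensus_sequence}."""
    groups = defaultdict(list)
    for primer_id, seq in tagged_reads:
        groups[primer_id].append(seq)

    consensus = {}
    for primer_id, seqs in groups.items():
        # Skip tags with too few reads to support a majority call.
        if len(seqs) < min_reads:
            continue
        # Assumption: reads are pre-aligned; skip groups of unequal length.
        if len({len(s) for s in seqs}) != 1:
            continue
        # Majority base at each position across the group.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Hypothetical reads: three share the tag "AAAC", one has the tag "GGTA".
example_reads = [
    ("AAAC", "ACGT"),
    ("AAAC", "ACGT"),
    ("AAAC", "ACTT"),  # one read carries a likely sequencing error
    ("GGTA", "TTTT"),  # dropped: only one read for this tag
]
templates = consensus_by_primer_id(example_reads)
```

Counting one consensus per tag, rather than every raw read, is what lets the number of sequences reflect the number of templates actually sampled.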

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Amino Acid Sequence
  • Female
  • HIV Infections / genetics*
  • HIV Infections / metabolism
  • HIV Infections / virology
  • HIV-1 / genetics*
  • HIV-1 / isolation & purification
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Linkage Disequilibrium
  • Male
  • Middle Aged
  • Phylogeny
  • Receptors, CCR5 / genetics
  • Receptors, CCR5 / metabolism
  • Receptors, CXCR4 / genetics
  • Receptors, CXCR4 / metabolism
  • Receptors, HIV / classification
  • Receptors, HIV / genetics*
  • Receptors, HIV / metabolism
  • Sequence Homology, Amino Acid
  • Viral Tropism*
  • Virus Attachment
  • env Gene Products, Human Immunodeficiency Virus / chemistry
  • env Gene Products, Human Immunodeficiency Virus / classification
  • env Gene Products, Human Immunodeficiency Virus / genetics*
  • env Gene Products, Human Immunodeficiency Virus / metabolism

Substances

  • CXCR4 protein, human
  • Receptors, CCR5
  • Receptors, CXCR4
  • Receptors, HIV
  • env Gene Products, Human Immunodeficiency Virus