HIV-1 Protease, Reverse Transcriptase, and Integrase Variation

J Virol. 2016 Jun 10;90(13):6058-6070. doi: 10.1128/JVI.00495-16. Print 2016 Jul 1.


HIV-1 protease (PR), reverse transcriptase (RT), and integrase (IN) variability presents a challenge to laboratories performing genotypic resistance testing. This challenge will grow with increased sequencing of samples enriched for proviral DNA such as dried blood spots and increased use of next-generation sequencing (NGS) to detect low-abundance HIV-1 variants. We analyzed PR and RT sequences from >100,000 individuals and IN sequences from >10,000 individuals to characterize variation at each amino acid position, identify mutations indicating APOBEC-mediated G-to-A editing, and identify mutations resulting from selective drug pressure. Forty-seven percent of PR, 37% of RT, and 34% of IN positions had one or more amino acid variants with a prevalence of ≥1%. Seventy percent of PR, 60% of RT, and 60% of IN positions had one or more variants with a prevalence of ≥0.1%. Overall, 201 PR, 636 RT, and 346 IN variants had a prevalence of ≥0.1%. The median intersubtype prevalence ratios were 2.9-, 2.1-, and 1.9-fold for these PR, RT, and IN variants, respectively. Only 5.0% of PR, 3.7% of RT, and 2.0% of IN variants had a median intersubtype prevalence ratio of ≥10-fold. Variants at lower prevalences were more likely to differ biochemically and to be part of an electrophoretic mixture compared to high-prevalence variants. There were 209 mutations indicative of APOBEC-mediated G-to-A editing and 326 nonpolymorphic treatment-selected mutations. Identification of viruses with a high number of APOBEC-associated mutations will facilitate the quality control of dried blood spot sequencing. Identifying sequences with a high proportion of rare mutations will facilitate the quality control of NGS.
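
The per-position prevalence figures above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `variant_prevalence` and `fraction_of_positions_with_variant`, the toy consensus, and the toy sequences are all hypothetical, and the sketch assumes one pre-aligned amino acid sequence per individual with no electrophoretic mixtures.

```python
from collections import Counter

# Characters treated as missing data (assumption for this sketch).
MISSING = {"X", "-", "."}


def variant_prevalence(sequences, consensus):
    """Return {(position, amino_acid): prevalence} for non-consensus residues.

    sequences: aligned amino acid strings, one per individual, each the
               same length as `consensus`.
    Positions are 1-based. Missing characters are excluded from the
    denominator at that position.
    """
    prevalence = {}
    for i, ref in enumerate(consensus):
        counts = Counter(seq[i] for seq in sequences if seq[i] not in MISSING)
        total = sum(counts.values())
        if total == 0:
            continue  # no usable data at this position
        for aa, n in counts.items():
            if aa != ref:
                prevalence[(i + 1, aa)] = n / total
    return prevalence


def fraction_of_positions_with_variant(prevalence, n_positions, threshold):
    """Fraction of positions with at least one variant at or above threshold."""
    positions = {pos for (pos, _aa), p in prevalence.items() if p >= threshold}
    return len(positions) / n_positions


# Toy data only; real analyses would use thousands of sequences.
toy_consensus = "PQIT"
toy_sequences = ["PQIT", "PQIT", "PQVT", "PQIT", "PRVT"]
prev = variant_prevalence(toy_sequences, toy_consensus)
frac = fraction_of_positions_with_variant(prev, len(toy_consensus), 0.25)
```

With real data, the thresholds of interest would be 0.01 and 0.001, matching the ≥1% and ≥0.1% cutoffs reported in the abstract.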

Importance: Most antiretroviral drugs target three HIV-1 proteins: PR, RT, and IN. These proteins are highly variable: many different amino acids can be present at the same position in viruses from different individuals. Some of the amino acid variants cause drug resistance and occur mainly in individuals receiving antiretroviral drugs. Some variants result from a human cellular defense mechanism called APOBEC-mediated hypermutation. Many variants result from naturally occurring mutation. Some variants may represent technical artifacts. We studied PR and RT sequences from >100,000 individuals and IN sequences from >10,000 individuals to quantify variation at each amino acid position in these three HIV-1 proteins. We performed analyses to determine which amino acid variants resulted from antiretroviral drug selection pressure, APOBEC-mediated editing, and naturally occurring variation. Our results provide information essential to clinical, research, and public health laboratories performing genotypic resistance testing by sequencing HIV-1 PR, RT, and IN.
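
As a rough illustration of how G-to-A editing might be screened for at the nucleotide level, the following Python sketch counts G-to-A differences between an aligned reference and query sequence, grouped by the base that follows the edited G. It relies on the commonly described preference of APOBEC3G for GG and APOBEC3F for GA dinucleotides; it is not the method used in the study. The function name `apobec_context_changes` and the toy sequences are hypothetical, and taking the downstream base from the reference rather than the query is a simplifying assumption.

```python
def apobec_context_changes(reference, query):
    """Count G-to-A differences by dinucleotide context.

    reference, query: aligned plus-strand nucleotide strings of equal length.
    Returns counts of G-to-A changes where the edited G is followed (in the
    reference) by G ('GG'), by A ('GA'), or by anything else ('other').
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    ref, qry = reference.upper(), query.upper()
    counts = {"GG": 0, "GA": 0, "other": 0}
    for i, (r, q) in enumerate(zip(ref, qry)):
        if r == "G" and q == "A":
            # Downstream base; empty string at the end of the sequence.
            nxt = ref[i + 1] if i + 1 < len(ref) else ""
            if nxt == "G":
                counts["GG"] += 1
            elif nxt == "A":
                counts["GA"] += 1
            else:
                counts["other"] += 1
    return counts


# Toy sequences only: one G-to-A change in GG context, one in GA context.
toy_reference = "ATGGCGATGGAC"
toy_query = "ATAGCGATGAAC"
toy_counts = apobec_context_changes(toy_reference, toy_query)
```

A sequence with many changes in the GG and GA categories relative to other substitutions would be a candidate for closer inspection; any cutoff for calling a sequence hypermutated would need to be chosen and validated separately.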

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • APOBEC Deaminases / genetics
  • APOBEC Deaminases / metabolism*
  • Amino Acid Sequence
  • Anti-HIV Agents / therapeutic use
  • Drug Resistance, Viral / genetics
  • Genetic Variation*
  • Genotype
  • HIV Infections / drug therapy
  • HIV Infections / virology
  • HIV Integrase / chemistry
  • HIV Integrase / genetics*
  • HIV Protease / chemistry
  • HIV Protease / genetics*
  • HIV Reverse Transcriptase / chemistry
  • HIV Reverse Transcriptase / genetics*
  • HIV-1 / enzymology
  • HIV-1 / genetics*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Mutation
  • Reverse Transcriptase Inhibitors / therapeutic use

Substances

  • Anti-HIV Agents
  • Reverse Transcriptase Inhibitors
  • HIV Integrase
  • HIV Reverse Transcriptase
  • HIV Protease
  • p16 protease, Human immunodeficiency virus 1
  • APOBEC Deaminases