Assessment of ambiguous base calls in HIV-1 pol population sequences as a biomarker for identification of recent infections in HIV-1 incidence studies

J Clin Microbiol. 2014 Aug;52(8):2977-83. doi: 10.1128/JCM.03289-13. Epub 2014 Jun 11.


An increase in the proportion of ambiguous base calls in HIV-1 pol population sequences during the course of infection has been demonstrated in different study populations, and sequence ambiguity thresholds to classify infections as recent or nonrecent have been suggested. The aim of our study was to evaluate sequence ambiguities as a candidate biomarker for use in an HIV-1 incidence assay using samples from antiretroviral treatment-naive seroconverters with known durations of infection (German HIV-1 Seroconverter Study). We used 2,203 HIV-1 pol population sequences derived from 1,334 seroconverters to assess the sequence ambiguity method (SAM). We then compared the serological incidence BED capture enzyme immunoassay (BED-CEIA) with the SAM for a subset of 723 samples from 495 seroconverters and evaluated a multianalyte algorithm that includes BED-CEIA results, SAM results, viral loads, and CD4 cell counts for 453 samples from 325 seroconverters. We observed a significant increase in the proportion of sequence ambiguities with the duration of infection. A sequence ambiguity threshold of 0.5% best identified recent infections with 76.7% accuracy. The mean duration of recency was determined to be 208 (95% confidence interval, 196 to 221) days. In the subset analysis, BED-CEIA achieved a significantly higher accuracy than the SAM (84.6 versus 75.5%, P < 0.001) and results were concordant for 64.2% (464/723) of the samples. Also, the multianalyte algorithm did not show better accuracy than the BED-CEIA (83.4 versus 84.3%, P = 0.786). In conclusion, the SAM and the multianalyte algorithm including SAM were inferior to the BED-CEIA, and the proportion of sequence ambiguities is therefore not a preferable biomarker for HIV-1 incidence testing.
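The classification step of the sequence ambiguity method can be illustrated with a minimal sketch. It is not the study's implementation. The function names are hypothetical, and the abstract does not state several details that are assumed here: which IUPAC codes count as ambiguous (here all mixture codes, including N), that gaps and other non-base characters are excluded from the denominator, and that a proportion exactly at the 0.5% threshold is classified as recent.

```python
# Sketch of a sequence ambiguity classifier (assumptions noted below).
# Assumption: every IUPAC mixture code, including N, counts as ambiguous.
AMBIGUOUS = set("RYSWKMBDHVN")
CALLED = set("ACGT") | AMBIGUOUS


def ambiguity_proportion(seq: str) -> float:
    """Return the fraction of base calls in seq that are ambiguous."""
    # Assumption: gaps and other non-base characters are ignored.
    bases = [b for b in seq.upper() if b in CALLED]
    if not bases:
        raise ValueError("sequence contains no base calls")
    return sum(b in AMBIGUOUS for b in bases) / len(bases)


def classify(seq: str, threshold: float = 0.005) -> str:
    """Label a sequence as recent or nonrecent at a 0.5% threshold.

    Assumption: a proportion equal to the threshold counts as recent.
    """
    if ambiguity_proportion(seq) <= threshold:
        return "recent"
    return "nonrecent"
```

Because the proportion of ambiguities rises with the duration of infection, sequences at or below the threshold are labeled recent and those above it nonrecent. For example, a 1,000-base sequence with 10 ambiguous calls has a proportion of 1.0% and would be labeled nonrecent under these assumptions.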

Publication types

  • Multicenter Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • CD4 Lymphocyte Count
  • Cohort Studies
  • Female
  • Genetic Markers*
  • Genetic Variation*
  • Genotype
  • HIV Infections / diagnosis*
  • HIV Infections / virology*
  • HIV-1 / classification
  • HIV-1 / genetics*
  • HIV-1 / isolation & purification
  • Humans
  • Male
  • Prospective Studies
  • Viral Load
  • pol Gene Products, Human Immunodeficiency Virus / genetics*


Substances

  • Genetic Markers
  • pol Gene Products, Human Immunodeficiency Virus