Identifying integration sites of the HIV-1 genome with intact and aberrant ends through deep sequencing

J Virol Methods. 2019 May;267:59-65. doi: 10.1016/j.jviromet.2019.03.004. Epub 2019 Mar 8.


Paired-end deep sequencing is a powerful tool for investigating integration sites of the HIV-1 genome in infected cells. Integration sites of HIV-1 proviral DNA carrying intact LTR ends have been well documented. In contrast, integration sites of proviral DNA with aberrant ends, which emerge infrequently but can also induce replication-competent viruses, have not been extensively examined, in part because no suitable bioinformatics method for deep sequencing data has been available. Here, we report a novel bioinformatics protocol, named the VINSSRM, that searches for integration sites of proviral DNA carrying intact and aberrant LTR ends using paired-end deep sequencing data. The protocol incorporates split-read mapping, which assigns the viral and human genome parts within each read sequence, and merging of overlapping paired-end reads, which constructs long error-corrected sequences. The VINSSRM not only consistently detects integration sites similar to those found by the conventional method but also provides information on additional integration sites, including those of proviral DNA with aberrant ends, which were mainly found in non-exonic regions of the human genome. The VINSSRM may therefore help us understand HIV-1 integration, persistence of infected cells, and viral latency.
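To make the idea of overlapping paired-end read merging concrete, the following is a minimal Python sketch. It is not the VINSSRM implementation, whose details are not given in this abstract. The function names (`reverse_complement`, `merge_pair`), the parameters `min_overlap` and `max_mismatch_frac`, and the rule of keeping the higher-quality base at a disagreement are all illustrative assumptions. The sketch only handles the simple case in which the end of the forward read overlaps the start of the reverse-complemented reverse read.

```python
from typing import List, Optional, Tuple

# Translation table for complementing uppercase DNA (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(
    fwd: str,
    fwd_qual: List[int],
    rev: str,
    rev_qual: List[int],
    min_overlap: int = 10,           # assumed threshold, not from the paper
    max_mismatch_frac: float = 0.1,  # assumed threshold, not from the paper
) -> Optional[Tuple[str, List[int]]]:
    """Merge an overlapping read pair into one sequence, or return None.

    Qualities are Phred scores given as integers, one per base.
    """
    # Bring the reverse read onto the forward strand.
    rev_rc = reverse_complement(rev)
    rev_rc_qual = rev_qual[::-1]

    # Try the longest candidate overlap first:
    # suffix of the forward read against prefix of rev_rc.
    for olen in range(min(len(fwd), len(rev_rc)), min_overlap - 1, -1):
        start = len(fwd) - olen
        mismatches = sum(
            1 for i in range(olen) if fwd[start + i] != rev_rc[i]
        )
        if mismatches > max_mismatch_frac * olen:
            continue

        merged = list(fwd[:start])
        quals = list(fwd_qual[:start])
        for i in range(olen):
            a, qa = fwd[start + i], fwd_qual[start + i]
            b, qb = rev_rc[i], rev_rc_qual[i]
            if a == b:
                # Agreement: keep the base with the better score.
                merged.append(a)
                quals.append(max(qa, qb))
            elif qa >= qb:
                # Disagreement: trust the higher-quality call.
                merged.append(a)
                quals.append(qa - qb)
            else:
                merged.append(b)
                quals.append(qb - qa)

        # Append the part of the reverse read beyond the overlap.
        merged.extend(rev_rc[olen:])
        quals.extend(rev_rc_qual[olen:])
        return "".join(merged), quals

    return None
```

In a pipeline of this kind, the merged sequence would then go to a split-read aligner so that its viral (LTR) part and its human part can be assigned separately. That step is left out here because it depends on the aligner and reference genomes chosen.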

Keywords: Bioinformatics; Clonal expansion; Deep sequencing; HIV-1; Integration site.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • DNA, Viral / genetics
  • Genome, Viral*
  • HIV Infections / virology
  • HIV-1 / genetics*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Proviruses / genetics*
  • Sensitivity and Specificity
  • Virus Integration / genetics*


Substances

  • DNA, Viral