Background: Human papillomavirus (HPV) is a well-established driver of malignant transformation in squamous cell carcinomas at a number of sites, including the head and neck, cervix, vulva, anorectum, and penis; however, the impact of HPV integration into the host human genome on this process remains largely unresolved. This is due to the technical challenge of identifying HPV integration sites, including the limitations of existing informatics approaches in discovering viral-host breakpoints from low-read-coverage sequencing data.
Methods: To overcome this limitation, the authors developed SearcHPV, a new pipeline for detecting HPV integration that is based on targeted capture technology, and applied it to targeted capture sequencing data. They then performed an integrated analysis of SearcHPV-defined breakpoints with genome-wide linked-read sequencing to identify potential HPV-related structural variations.
Results: Through an analysis of HPV+ models, the authors showed that SearcHPV detected HPV-host integration sites with a higher sensitivity and specificity than 2 other commonly used HPV detection callers. SearcHPV uncovered HPV integration sites adjacent to known cancer-related genes, including TP63, MYC, and TRAF2, and near regions of large structural variation. The authors further validated the junction contig assembly feature of SearcHPV, which helped to accurately identify viral-host junction breakpoint sequences. They found that viral integration occurred through a variety of DNA repair mechanisms, including nonhomologous end joining, alternative end joining, and microhomology-mediated repair.
Conclusions: In summary, SearcHPV is a new optimized tool for the accurate detection of HPV-human integration sites from targeted capture DNA sequencing data.
Keywords: DNA sequence analysis; bioinformatics; genomics; papillomavirus infections; squamous cell carcinoma; virus integration.
© 2021 American Cancer Society.