Objectives: Human papillomavirus (HPV) influences the pathobiology of head and neck squamous cell carcinomas (HNSCCs). While deep learning shows promise in detecting HPV from hematoxylin and eosin (H&E) stained slides, the histologic features it relies on remain unclear. This study leverages artificial intelligence (AI) foundation models to characterize histopathologic features associated with HPV presence and to objectively describe patterns of variability in the HPV-positive space.
Materials and methods: H&E images from 981 HNSCC patients across public and institutional datasets were analyzed. We used UNI, a foundation model based on self-supervised learning (SSL), to map the landscape of HNSCC histology and identify the axes of SSL features that best separate HPV-positive and HPV-negative tumors. To interpret the histologic features that vary across different regions of this landscape, we used HistoXGAN, a pretrained generative adversarial network (GAN), to generate synthetic histology images from SSL features, which a pathologist rigorously assessed.
Results: Analysis of AI-generated synthetic images revealed distinctive features of HPV-positive histology, such as smaller, paler, more monomorphic nuclei; purpler, amphophilic cytoplasm; and indistinct cell borders with rounded tumor contours. The SSL feature axes we identified enabled accurate prediction of HPV status from histology, achieving validation sensitivity and specificity of 0.81 and 0.92, respectively. Our analysis subdivided image tiles from HPV-positive histology into three overlapping subtypes: border, inflamed, and stroma.
Conclusion: Foundation-model-derived synthetic pathology images effectively capture HPV-related histology. Our analysis identifies distinct subtypes within HPV-positive HNSCCs and enables accurate, explainable detection of HPV presence directly from histology, offering a valuable approach for low-resource clinical settings.
Keywords: Head and neck squamous cell carcinoma; Histology; Human papillomavirus; Self-supervised learning.
Copyright © 2025. Published by Elsevier Ltd.