Background and objective: Overexpression of the human epidermal growth factor receptor 2 (HER2) protein is one of the most significant biomarkers for breast cancer diagnosis, treatment prediction, and prognosis. Because HER2 inhibitors are widely available in routine clinical practice, precise and robust identification of this marker is a direct diagnostic need. Although multigene next-generation sequencing methods have gradually been replacing single-biomarker molecular tests, copy number alterations such as amplification of the HER2-coding ERBB2 gene are hard to validate on next-generation sequencing platforms because they are confounded by chromosomal structural heterogeneity, polysomy, and the genomic context of ploidy. In this study, we tested whether whole genome sequencing, rather than next-generation sequencing panels, can be used to determine HER2 status in the clinical setting.
Methods: We used a large dataset of whole genomes from 876 patients with breast cancer, with curated clinical data, and an additional external set of genomic data from 551 patients. We used a decision-tree-based algorithm to optimize the diagnostic tool for assessing HER2 status by whole genome sequencing.
Results: The most efficient approach to assessing HER2 status in whole genome sequencing data was the ploidy-corrected copy number, which combines ERBB2 copy number and mean tumor ploidy. The classifier achieved a sensitivity of 91.18% and a specificity of 98.69% on the internal validation dataset, and 89.86% and 96.06%, respectively, on the external data, which is similar to other next-generation sequencing methods currently being tested in the clinic.
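As an illustration only, the idea of a ploidy-corrected copy number classifier can be sketched as follows. The abstract does not state the exact formula or cut-off, so this sketch assumes the correction is the ratio of ERBB2 copy number to mean tumor ploidy and uses a hypothetical threshold of 2.0; the `Sample` type, the function names, and the toy records are all invented for the example and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Sample:
    """Hypothetical per-tumor record; field names are assumptions."""
    erbb2_copy_number: float   # absolute ERBB2 copy number from WGS
    mean_ploidy: float         # mean tumor ploidy from WGS
    her2_positive: bool        # reference HER2 status (e.g. from pathology)


def ploidy_corrected_cn(erbb2_copy_number: float, mean_ploidy: float) -> float:
    """Assumed definition: ERBB2 copy number divided by mean tumor ploidy."""
    if mean_ploidy <= 0:
        raise ValueError("mean_ploidy must be positive")
    return erbb2_copy_number / mean_ploidy


def predict_her2(sample: Sample, threshold: float = 2.0) -> bool:
    """Call HER2-positive when the corrected value reaches the threshold.

    The default threshold is a placeholder, not the study's cut-off.
    """
    value = ploidy_corrected_cn(sample.erbb2_copy_number, sample.mean_ploidy)
    return value >= threshold


def sensitivity_specificity(
    samples: Iterable[Sample], threshold: float = 2.0
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) against the reference status."""
    tp = fn = tn = fp = 0
    for s in samples:
        predicted = predict_her2(s, threshold)
        if s.her2_positive and predicted:
            tp += 1
        elif s.her2_positive:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy data, invented purely to exercise the functions.
    toy = [
        Sample(12.0, 2.0, True),
        Sample(6.0, 4.0, True),
        Sample(2.0, 2.0, False),
        Sample(5.0, 4.0, False),
        Sample(9.0, 3.0, False),
    ]
    sens, spec = sensitivity_specificity(toy)
    print(f"sensitivity={sens:.2%} specificity={spec:.2%}")
```

Dividing by ploidy is meant to show why a raw copy number alone can mislead: five ERBB2 copies in a tetraploid tumor is a modest relative gain, whereas the same count in a diploid tumor is not. In practice the cut-off would be chosen on training data, for example by the decision-tree-based optimization described in the Methods.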
Conclusions: We provide evidence that HER2 status may be reliably determined by whole genome sequencing and that the approach is applicable across different laboratory protocols and pipelines. We suggest using the ploidy-corrected copy number for diagnostic purposes.
© 2021. The Author(s).