Background: DNA released from tumor cells into the blood (circulating tumor DNA, ctDNA) carries tumor-specific genomic aberrations, providing a non-invasive means of cancer detection. In this study, we aimed to leverage somatic copy number aberrations (SCNAs) in ctDNA to develop an assay for detecting early-stage hepatocellular carcinoma (HCC).
Methods: We conducted low-depth whole-genome sequencing (WGS) to profile SCNAs in 384 plasma samples from patients with hepatitis B virus (HBV)-related HCC and from cancer-free HBV patients, using one discovery cohort and two validation cohorts. To capture genome-wide signals in the WGS data, we developed a machine learning-based statistical model focused on detection accuracy in early-stage HCC.
Findings: We built the model using a discovery cohort of 209 patients, achieving an overall area under the curve (AUC) of 0.893, with 0.874 for early-stage (Barcelona Clinic Liver Cancer [BCLC] stage 0-A) and 0.933 for advanced-stage (BCLC stage B-D) HCC. The performance of the model was then assessed in two validation cohorts (76 and 99 patients) in which all HCC patients had stage 0-A disease. Our model exhibited robust predictive performance, with AUCs of 0.920 and 0.812 in the two validation cohorts, respectively. Further analyses showed that tumor sample heterogeneity in model training affects the detection of early-stage tumors, and a refined model addressing this heterogeneity in the discovery cohort significantly increased model performance in validation.
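The stage-stratified AUCs reported above can be illustrated with a minimal sketch. It assumes each sample has a model score, a binary label (1 = HCC, 0 = cancer-free) and, for HCC cases, a BCLC stage; the function names `auc` and `auc_by_stage` and the toy values are hypothetical and are not taken from the study. AUC is computed here as the probability that a randomly chosen case scores higher than a randomly chosen control, which may differ from the exact procedure the authors used.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs in which the case
    scores higher than the control; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_stage(scores, labels, stages, stage_group):
    """AUC for HCC cases whose stage is in stage_group versus all controls."""
    keep = [i for i, (y, st) in enumerate(zip(labels, stages))
            if y == 0 or st in stage_group]
    return auc([scores[i] for i in keep], [labels[i] for i in keep])


# Toy data (hypothetical, for illustration only).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.5, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
stages = ["B", "C", "0", "A", None, None, None, None]

overall = auc(scores, labels)
early = auc_by_stage(scores, labels, stages, {"0", "A"})
advanced = auc_by_stage(scores, labels, stages, {"B", "C", "D"})
```

Comparing each stage group against the same set of controls keeps the early-stage and advanced-stage values directly comparable, since only the cases change between the two calculations.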
Interpretation: We developed an SCNA-based, machine learning-driven model for the non-invasive detection of early-stage HCC in HBV patients and demonstrated its performance through strict independent validation.
Keywords: Copy number aberration (CNA); Early detection; Hepatocellular carcinoma (HCC); Machine learning.
Copyright © 2020 The Authors. Published by Elsevier B.V. All rights reserved.