Objective: This study aimed to develop quantitative feature-based models from histopathological images to distinguish hepatocellular carcinoma (HCC) from adjacent normal tissue and to predict the prognosis of HCC patients after surgical resection.
Methods: A fully automated computational pipeline was constructed to analyze quantitative features of histopathological slides from HCC patients. Features were extracted from hematoxylin and eosin (H&E)-stained whole-slide images from The Cancer Genome Atlas and from tissue microarray images from West China Hospital. Machine-learning methods were then used to train statistical models on the extracted features to classify tissue slides and predict patients' survival outcomes.
Results: A total of 1733 quantitative image features were extracted from each histopathological slide. The diagnostic classifier, based on 31 features, distinguished HCC from adjacent normal tissue in both the test set [area under the receiver operating characteristic curve (AUC) 0.988] and the external validation set (AUC 0.886). The random-forest prognostic model, using 46 features, significantly stratified patients in each set into longer- and shorter-term survival groups according to their assigned risk scores. Moreover, the prognostic model showed predictive accuracy comparable to that of TNM staging systems for patients' survival at different time points after surgery.
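For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how it can be computed from classifier scores: the AUC equals the probability that a randomly chosen positive sample (here, HCC) receives a higher score than a randomly chosen negative sample (adjacent normal tissue), with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not the study's code or data.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive sample has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 1 = HCC, 0 = adjacent normal.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(f"AUC = {roc_auc(labels, scores):.3f}")
```

This pairwise form is quadratic in the number of samples and is intended only to make the definition concrete; a rank-based implementation would be preferable for large slide collections.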
Conclusions: Our findings suggest that machine-learning models derived from image features can assist clinicians in diagnosing HCC and predicting prognosis after hepatectomy.