A novel transfer-learning based physician-level general and subtype classifier for non-small cell lung cancer

Heliyon. 2022 Nov 29;8(12):e11981. doi: 10.1016/j.heliyon.2022.e11981. eCollection 2022 Dec.


Confirming the histologic patterns (subtypes) of lung carcinoma, and of lung adenocarcinoma in particular, is important for determining a patient's prognosis and treatment options. The task is challenging and often requires the input of experienced pathologists, among whom interobserver concordance is itself limited. Computer-aided diagnosis has the potential to shorten the time to diagnosis. Because many adenocarcinoma tissue samples contain multiple histologic patterns, accurate computer-aided diagnosis requires annotations made manually by pathologists. We propose a method that combines weakly supervised learning and integrated learning through transfer learning on two datasets, The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC), to reduce the need for manual annotation by a pathologist while maintaining accuracy. Whole-slide images (WSIs) are first classified as either adenocarcinoma or squamous cell carcinoma; subtypes are then identified by generating a weak classifier for each subtype and using integrated learning to combine them into a strong classifier. Our model was evaluated on independent data from CPTAC and on a dataset from a private hospital. It achieved AUC values of 0.86, 0.91, 0.82, 0.77, 0.96, and 0.98 for the Acinar, LPA, Micropapillary, Papillary, Solid, and Normal classes, respectively.
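
The subtype step described above (one weak classifier per subtype, combined into a strong classifier) can be illustrated with a minimal sketch. The abstract does not specify the combination rule, so the weighted, normalized averaging below is an assumption chosen for illustration, not the authors' method. The function names `combine_tile` and `classify_slide`, the per-tile score dictionaries, and the uniform default weights are all hypothetical.

```python
from statistics import mean

# Class names taken from the abstract's reported AUC list.
SUBTYPES = ["Acinar", "LPA", "Micropapillary", "Papillary", "Solid", "Normal"]


def combine_tile(weak_scores, weights):
    """Combine one tile's one-vs-rest weak scores into a normalized distribution.

    ASSUMPTION: each weak classifier outputs a score in [0, 1] for its own
    subtype, and the "strong" classifier is a weighted, normalized vote.
    """
    weighted = {s: weights[s] * weak_scores[s] for s in SUBTYPES}
    total = sum(weighted.values())
    if total == 0:
        # No classifier fired: fall back to a uniform distribution.
        return {s: 1.0 / len(SUBTYPES) for s in SUBTYPES}
    return {s: v / total for s, v in weighted.items()}


def classify_slide(tiles, weights=None):
    """Aggregate tile-level distributions into a slide-level prediction.

    `tiles` is a list of dicts mapping each subtype to its weak score.
    ASSUMPTION: slide-level scores are the mean of tile-level scores.
    """
    if not tiles:
        raise ValueError("a slide needs at least one tile")
    if weights is None:
        weights = {s: 1.0 for s in SUBTYPES}
    per_tile = [combine_tile(t, weights) for t in tiles]
    slide_scores = {s: mean(p[s] for p in per_tile) for s in SUBTYPES}
    predominant = max(slide_scores, key=slide_scores.get)
    return predominant, slide_scores


if __name__ == "__main__":
    # Two made-up tiles, used only to exercise the functions.
    base = {s: 0.1 for s in SUBTYPES}
    tiles = [{**base, "Solid": 0.9}, {**base, "Solid": 0.8, "Acinar": 0.6}]
    label, scores = classify_slide(tiles)
    print(label, {s: round(v, 3) for s, v in scores.items()})
```

Keeping the full slide-level score dictionary, rather than only the top label, reflects the observation that a single sample may contain several histologic patterns. In practice the weights would have to be chosen on held-out data.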

Keywords: Adenocarcinoma subtype classification; Lung adenocarcinoma; NSCLC; Squamous cell carcinoma; The cancer genome atlas; Transfer learning; Weak supervised learning.