Diagnostic performance of a deep learning-based method in differentiating malignant from benign subcentimeter (≤10 mm) solid pulmonary nodules

J Thorac Dis. 2023 Oct 31;15(10):5475-5484. doi: 10.21037/jtd-23-985. Epub 2023 Sep 19.

Abstract

Background: This study assessed the diagnostic performance of a deep learning (DL)-based model for differentiating malignant from benign subcentimeter (≤10 mm) solid pulmonary nodules (SSPNs) on computed tomography (CT) images, compared with that of radiologists with 10 and 15 years of experience in thoracic imaging (medium to senior level).

Methods: Overall, 200 SSPNs (100 benign and 100 malignant) were retrospectively collected. Malignancy was confirmed by pathology, and benignity was confirmed by follow-up or pathology. CT images were fed into the DL model to obtain the probability of malignancy (range, 0-100%) for each nodule. Based on the diagnostic results, each enrolled nodule was classified as benign, malignant, or indeterminate. The accuracy and diagnostic composition of the model were compared with those of the radiologists using the McNemar-Bowker test. Enrolled nodules were also divided into 3-6-, 6-8-, and 8-10-mm subgroups, and within each subgroup the diagnostic results of the model were compared with those of the radiologists.
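
The McNemar-Bowker test named above checks whether two raters' paired classifications of the same items are symmetric across categories. The following is a minimal sketch of that test in plain Python, not the authors' analysis code: the function names `bowker_symmetry` and `chi2_sf` are hypothetical, the 3x3 table holds made-up illustrative counts (not study data), and the chi-square tail probability uses closed forms that hold only for 1 to 3 degrees of freedom.

```python
import math


def bowker_symmetry(table):
    """Bowker's test of symmetry for a square table of paired ratings.

    table[i][j] counts items placed in category i by rater A and in
    category j by rater B. Returns (statistic, degrees_of_freedom).
    """
    k = len(table)
    stat = 0.0
    df = 0
    for i in range(k):
        for j in range(i + 1, k):
            n_ij, n_ji = table[i][j], table[j][i]
            # Off-diagonal pairs with no discordant items carry no information.
            if n_ij + n_ji > 0:
                stat += (n_ij - n_ji) ** 2 / (n_ij + n_ji)
                df += 1
    return stat, df


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1, 2, 3 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("closed form implemented only for df in {1, 2, 3}")


# Illustrative counts only (NOT taken from the study).
# Rows: rater A (benign, malignant, indeterminate); columns: rater B.
example = [
    [20, 5, 30],
    [4, 25, 40],
    [1, 2, 3],
]

stat, df = bowker_symmetry(example)
print(f"Bowker statistic = {stat:.3f}, df = {df}, p = {chi2_sf(stat, df):.3g}")
```

For a 2x2 table the statistic reduces to the classical McNemar test without continuity correction, which is why the procedure is often called the McNemar-Bowker test.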

Results: The accuracy of the DL model in differentiating malignant from benign SSPNs was significantly higher than that of the radiologists (71.5% vs. 38.5%, P<0.001). The DL model returned more definitive (benign or malignant) results and fewer indeterminate results. In the subgroup analysis by nodule size, the DL model also outperformed the radiologists and gave fewer indeterminate results. The accuracy of the two methods in the 3-6-, 6-8-, and 8-10-mm subgroups was 75.5% vs. 28.3% (P<0.001), 62.0% vs. 28.2% (P<0.001), and 77.6% vs. 55.3% (P=0.001), respectively, and the proportions of indeterminate results were 3.8% vs. 66.0%, 8.5% vs. 66.2%, and 2.6% vs. 35.5% (all P<0.001), respectively.
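
As an arithmetic cross-check on the percentages above, the sketch below tests whether one set of integer counts reproduces every reported figure. The subgroup sizes (53, 71, and 76) and all per-subgroup counts are assumptions back-calculated from the reported percentages; they are not stated in the abstract and should be verified against the full paper. The names `assumed`, `reported`, and `pct` are hypothetical.

```python
# ASSUMED counts, back-calculated from the reported percentages.
# They are not given in the abstract.
assumed = {
    "3-6 mm":  {"n": 53, "dl_correct": 40, "rad_correct": 15, "dl_indet": 2, "rad_indet": 35},
    "6-8 mm":  {"n": 71, "dl_correct": 44, "rad_correct": 20, "dl_indet": 6, "rad_indet": 47},
    "8-10 mm": {"n": 76, "dl_correct": 59, "rad_correct": 42, "dl_indet": 2, "rad_indet": 27},
}

# Percentages as reported in the Results.
reported = {
    "3-6 mm":  {"dl_correct": 75.5, "rad_correct": 28.3, "dl_indet": 3.8, "rad_indet": 66.0},
    "6-8 mm":  {"dl_correct": 62.0, "rad_correct": 28.2, "dl_indet": 8.5, "rad_indet": 66.2},
    "8-10 mm": {"dl_correct": 77.6, "rad_correct": 55.3, "dl_indet": 2.6, "rad_indet": 35.5},
}


def pct(count, total):
    """Percentage rounded to one decimal place, as in the abstract."""
    return round(100 * count / total, 1)


# Compare each assumed count against the reported percentage.
for group, counts in assumed.items():
    for key, value in reported[group].items():
        status = "ok" if pct(counts[key], counts["n"]) == value else "MISMATCH"
        print(f"{group:8s} {key:12s} {pct(counts[key], counts['n']):5.1f}% {status}")

# The subgroup totals should also reproduce the overall figures.
total_n = sum(c["n"] for c in assumed.values())
overall_dl = pct(sum(c["dl_correct"] for c in assumed.values()), total_n)
overall_rad = pct(sum(c["rad_correct"] for c in assumed.values()), total_n)
print(f"overall: n={total_n}, DL={overall_dl}%, radiologists={overall_rad}%")
```

Under these assumed counts the subgroup totals sum to the 200 enrolled nodules, and the pooled accuracies can be compared with the overall 71.5% and 38.5% reported above.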

Conclusions: The DL-based method outperformed the radiologists in differentiating malignant from benign SSPNs. This DL model may reduce diagnostic uncertainty and improve diagnostic accuracy, especially for SSPNs smaller than 8 mm.

Keywords: Computed tomography (CT); artificial intelligence (AI); deep learning (DL); differential diagnosis; solitary pulmonary nodule.