Purpose: To quantify the reproducibility and accuracy of experienced thoracic radiologists in differentiating between subsolid and solid pulmonary nodules at CT.
Materials and methods: The institutional review board of Beth Israel Deaconess Medical Center approved this multicenter study. Six thoracic radiologists, with a mean of 21 years of experience in thoracic radiology (range, 17-22 years), selected images of 10 solid and 10 subsolid nodules to create a database of 120 nodules; this selection served as the reference standard. Each radiologist then interpreted 120 randomly ordered nodules in two different sessions that were separated by a minimum of 3 weeks. The radiologists classified whether or not each nodule was subsolid. Inter- and intraobserver agreement was assessed with a κ statistic. The number of correct classifications was calculated and correlated with nodule size by using Bland-Altman plots. The relationship between disagreement and nodule morphologic characteristics was analyzed by calculating the intraclass correlation coefficient.
Results: Interobserver agreement (κ) was 0.619 (range, 0.469-0.745; 95% confidence interval (CI): 0.576, 0.663) and 0.670 (range, 0.440-0.839; 95% CI: 0.608, 0.733) for interpretation sessions 1 and 2, respectively. Intraobserver agreement (κ) was 0.792 (95% CI: 0.750, 0.833). Averaged for interpretation sessions, correct classification was achieved by all radiologists for 58% (70 of 120) of nodules. Radiologists agreed with their initial determination (the reference standard) in 77% of cases (range, 45%-100%). Nodule size weakly correlated with correct classification (long axis: Spearman rank correlation coefficient, rs = 0.161 and P = .049; short axis: rs = 0.128 and P = .163).
Conclusion: The reproducibility and accuracy of thoracic radiologists in classifying whether or not a nodule is subsolid varied in this retrospective study. This inconsistency may affect surveillance recommendations and prognostic determinations.
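Illustrative note: the κ statistic named in Materials and methods measures agreement beyond what chance alone would produce, κ = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the proportion expected by chance. The abstract does not state which κ variant or software was used, so the following is only a minimal sketch that assumes Cohen's κ for a single pair of readers. The function name `cohen_kappa` and the reader labels are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: pairwise Cohen's kappa; the study's exact kappa variant
    is not specified in the abstract.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies,
    # summed over all labels either rater used.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in labels) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical labels for 10 nodules (1 = subsolid, 0 = solid); NOT study data.
reader_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
reader_2 = [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]

kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 3))  # 0.6
```

In this made-up example the two readers agree on 8 of 10 nodules (p_o = 0.8), while chance agreement is p_e = 0.5, giving κ = 0.6. A multireader summary such as the one reported in Results would require averaging pairwise values or using a multirater variant; which approach the authors took is not stated in the abstract.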