Rationale and objectives: Studies that evaluate the lung nodule detection performance of radiologists or computerized methods depend on an initial inventory of the nodules within the thoracic images (the "truth"). The purpose of this study was to analyze (1) variability in the "truth" defined by different combinations of experienced thoracic radiologists and (2) variability in the performance of other experienced thoracic radiologists based on these definitions of "truth" in the context of lung nodule detection in computed tomographic (CT) scans.
Materials and methods: Twenty-five thoracic CT scans were reviewed by four thoracic radiologists, who independently marked lesions they considered to be nodules ≥3 mm in maximum diameter. Panel "truth" sets of nodules were then derived from the nodules marked by different combinations of two and three of these four radiologists. The nodule detection performance of the other radiologists was evaluated based on these panel "truth" sets.
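The panel construction described above can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the abstract: the reader labels, the nodule IDs, the helper names `panel_truth` and `enumerate_panels`, and the three combination rules (union, majority, intersection). The abstract refers to different panel "truth" conditions but does not define them.

```python
from itertools import combinations

# Hypothetical marks (NOT study data): reader label -> set of nodule IDs marked.
marks = {
    "A": {1, 2, 3, 5},
    "B": {2, 3, 4},
    "C": {1, 3, 6},
    "D": {3, 5, 6, 7},
}

def panel_truth(panel, marks, rule):
    """Combine the marks of the readers in `panel` under an assumed rule."""
    sets = [marks[r] for r in panel]
    if rule == "union":          # nodule marked by at least one panel member
        return set.union(*sets)
    if rule == "intersection":   # nodule marked by every panel member
        return set.intersection(*sets)
    if rule == "majority":       # nodule marked by more than half the panel
        candidates = set.union(*sets)
        return {n for n in candidates
                if sum(n in s for s in sets) > len(sets) / 2}
    raise ValueError(f"unknown rule: {rule}")

def enumerate_panels(readers, sizes=(2, 3)):
    """Yield (panel, remaining readers) for every panel of the given sizes."""
    ordered = sorted(readers)
    for k in sizes:
        for panel in combinations(ordered, k):
            others = tuple(r for r in ordered if r not in panel)
            yield panel, others

panels = list(enumerate_panels(marks))
```

With four readers, this enumeration yields six two-reader panels and four three-reader panels. Each panel leaves the remaining readers available for evaluation against that panel's "truth" set. For a two-reader panel, the majority rule as written here coincides with the intersection.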
Results: The number of "true" nodules in the different panel "truth" sets ranged from 15 to 89 (mean 49.8 ± 25.6). Mean radiologist nodule detection sensitivity, averaged across radiologists and panel "truth" sets, ranged from 51.0% to 83.2% across the different panel "truth" conditions; mean false-positive rates ranged from 0.33 to 1.39 per case.
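The two reported metrics, sensitivity and false positives per case, can be sketched as follows. The function name `detection_performance`, the example values, and the matching of marks to "true" nodules by shared ID are all assumptions made for illustration. The abstract does not describe the scoring criteria actually used.

```python
def detection_performance(reader_marks, truth, n_cases):
    """Return (sensitivity, false positives per case) for one reader.

    reader_marks, truth: sets of nodule IDs (hypothetical matching by ID).
    n_cases: number of CT scans reviewed.
    """
    true_positives = len(reader_marks & truth)   # marks that hit a "true" nodule
    false_positives = len(reader_marks - truth)  # marks absent from the "truth" set
    sensitivity = true_positives / len(truth) if truth else float("nan")
    return sensitivity, false_positives / n_cases

# Illustrative values only, not results from the study.
sens, fp_rate = detection_performance({2, 3, 4, 9, 10}, {1, 2, 3, 4}, n_cases=25)
```

Because both metrics are computed against the panel "truth" set, the same reader marks score differently when the panel or the combination rule changes.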
Conclusions: Substantial variability exists across radiologists in the task of lung nodule identification in CT scans. The definition of "truth" on which lung nodule detection studies are based must be carefully considered, because even experienced thoracic radiologists may not perform well when measured against the "truth" established by other experienced thoracic radiologists.