Purpose: To measure reader variability in the interpretation of screening chest radiographs (CXRs) for findings of primary lung cancer.
Materials and methods: From the National Lung Screening Trial (NLST), 100 cases were randomly selected from baseline CXR examinations for retrospective interpretation by 9 NLST radiologists; images with noncalcified lung nodules (NCNs) or other abnormalities suspicious for lung cancer as determined by the original NLST reader were oversampled. Agreement on the presence of pulmonary nodules and abnormalities suspicious for cancer and recommendations for follow-up were assessed by the multirater κ statistic.
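As an illustration of how a multirater κ can be computed, the sketch below implements Fleiss' kappa for a table of per-case category counts. The abstract does not state which multirater κ variant was used, so Fleiss' formulation is an assumption here. The function name `fleiss_kappa` and the toy data are hypothetical and are not NLST data.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a fixed number of readers per case.

    counts[i][j] is the number of readers who assigned case i to category j
    (for example, j = 0 for "no NCN" and j = 1 for "at least 1 NCN").
    Assumption: every case is rated by the same number of readers.
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_cases == 0 or n_raters < 2:
        raise ValueError("need at least one case and at least two readers")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fall in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per case: fraction of reader pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_cases      # mean observed agreement
    p_e = sum(p * p for p in p_j)   # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy data: 4 cases, 3 readers, 2 categories (no NCN, NCN).
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
toy_kappa = fleiss_kappa(toy_counts)
print(f"Fleiss' kappa for the toy data: {toy_kappa:.3f}")
```

For a design like the one described here, the input would have one row per case and each row would sum to the number of readers. Pairwise κ values between two readers are typically computed with Cohen's kappa instead.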
Results: The multirater κ statistic for interreader agreement on the presence of at least 1 NCN was 0.38. Rates at which readers reported the presence of at least 1 NCN ranged from 32% to 63% (mean, 41%); among 16 subjects with NCN and a cancer diagnosis within 1 year of the CXR examination, an average of 87% (range, 81% to 94%) of cases were classified as suspicious for cancer across all readers. The multirater κ for agreement on follow-up recommendations was 0.34; pairwise κ values ranged from 0.15 to 0.64 (mean, 0.36). For all subjects, readers recommended a follow-up procedure classified as high level (computed tomography, fluorodeoxyglucose-positron emission tomography, or biopsy) 42% of the time on average (range, 30% to 67%); this increased to 84% (range, 52% to 100%) when readers reported an NCN and 88% (range, 82% to 94%) for subjects with cancer.
Conclusion: Reader agreement on screening CXR interpretation and follow-up recommendations is fair overall (multirater κ, 0.34 to 0.38) but high for malignant lesions.
Trial registration: ClinicalTrials.gov NCT00047385.