Artificial intelligence-assisted double reading of chest radiographs to detect clinically relevant missed findings: a two-centre evaluation

Eur Radiol. 2024 Mar 11. doi: 10.1007/s00330-024-10676-w. Online ahead of print.

Abstract

Objectives: To evaluate an artificial intelligence (AI)-assisted double reading system for detecting clinically relevant missed findings on routinely reported chest radiographs.

Methods: A retrospective study was performed in two institutions, a secondary care hospital and tertiary referral oncology centre. Commercially available AI software performed a comparative analysis of chest radiographs and radiologists' authorised reports using a deep learning and natural language processing algorithm, respectively. The AI-detected discrepant findings between images and reports were assessed for clinical relevance by an external radiologist, as part of the commercial service provided by the AI vendor. The selected missed findings were subsequently returned to the institution's radiologist for final review.

Results: In total, 25,104 chest radiographs of 21,039 patients (mean age 61.1 years ± 16.2 [SD]; 10,436 men) were included. The AI software detected discrepancies between imaging and reports in 21.1% (5289 of 25,104). After review by the external radiologist, 0.9% (47 of 5289) of cases were deemed to contain clinically relevant missed findings. The institution's radiologists confirmed 35 of 47 missed findings (74.5%) as clinically relevant (0.1% of all cases). Missed findings consisted of lung nodules (71.4%, 25 of 35), pneumothoraces (17.1%, 6 of 35) and consolidations (11.4%, 4 of 35).

Conclusion: The AI-assisted double reading system was able to identify missed findings on chest radiographs after report authorisation. The approach required an external radiologist to review the AI-detected discrepancies. The number of clinically relevant missed findings by radiologists was very low.

Clinical relevance statement: The AI-assisted double reader workflow was shown to detect diagnostic errors and could be applied as a quality assurance tool. Although clinically relevant missed findings were rare, there is potential impact given the common use of chest radiography.

Key points: • A commercially available double reading system supported by artificial intelligence was evaluated to detect reporting errors in chest radiographs (n=25,104) from two institutions. • Clinically relevant missed findings were found in 0.1% of chest radiographs and consisted of unreported lung nodules, pneumothoraces and consolidations. • Applying AI software as a secondary reader after report authorisation can assist in reducing diagnostic errors without interrupting the radiologist's reading workflow. However, the number of AI-detected discrepancies was considerable and required review by a radiologist to assess their relevance.

Keywords: Artificial intelligence; Diagnostic errors; Healthcare quality assurance; Natural language processing; Thoracic radiography.