Objectives: To retrospectively evaluate the performance of a CE-marked AI system for identifying breast cancer on screening mammograms. Evidence from large retrospective studies is crucial for planning prospective studies and for ensuring safe implementation.
Materials and methods: We used data from screening examinations performed from 2004 to 2021 at ten breast centers in BreastScreen Norway. In the standard independent double reading setting, each radiologist scored each breast from 1 (negative) to 5 (high probability of cancer). The AI system assigned each examination an NT and an SN score; the NT score aimed to classify examinations as negative with minimal misclassification while the SN score aimed to classify examinations as positive with high confidence. N70 was defined as being among the 70% with the lowest NT score and P3 was defined as being among the 3% with the highest SN score.
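The rank-based definitions of N70 and P3 can be illustrated with a minimal sketch. It assumes each examination has one numeric NT score and one numeric SN score, that cut-offs are taken by rank over the whole sample, and that ties are broken by sort order; the function name `flag_by_rank` and its parameters are hypothetical and do not come from the AI system or the study.

```python
def flag_by_rank(nt_scores, sn_scores, negative_fraction=0.70, positive_fraction=0.03):
    """Return (n70_flags, p3_flags), one boolean per examination.

    Assumption: N70 = the given fraction of examinations with the lowest
    NT score, P3 = the given fraction with the highest SN score.
    """
    n = len(nt_scores)
    if n != len(sn_scores):
        raise ValueError("nt_scores and sn_scores must have the same length")

    # Number of examinations that fall inside each group.
    n_negative = round(n * negative_fraction)
    n_positive = round(n * positive_fraction)

    # Examination indices ordered by ascending NT and descending SN score.
    by_nt = sorted(range(n), key=lambda i: nt_scores[i])
    by_sn = sorted(range(n), key=lambda i: sn_scores[i], reverse=True)

    n70_flags = [False] * n
    for i in by_nt[:n_negative]:
        n70_flags[i] = True

    p3_flags = [False] * n
    for i in by_sn[:n_positive]:
        p3_flags[i] = True

    return n70_flags, p3_flags


# Toy data: 100 examinations with made-up scores, for illustration only.
toy_scores = [i / 100 for i in range(100)]
n70, p3 = flag_by_rank(toy_scores, toy_scores)
print(sum(n70), sum(p3))
```

With real data the two scores differ per examination, so the two groups are derived independently of each other.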
Results: A total of 1,017,208 screening examinations were included in the study sample. At N70, 1.8% (107/5977) of the screen-detected and 34.5% (625/1812) of the interval cancers were defined as negative. Using P3 to define cases as positive, 81.5% (4871/5977) of the screen-detected and 19.0% (344/1812) of the interval cancers were defined as positive. Among the screen-detected cancers in N70, 11.2% (12/107) had an interpretation score > 2 by both radiologists.
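The reported percentages follow directly from the counts given above. As a simple check, the short sketch below recomputes them; the dictionary labels and the helper name `percent` are illustrative only.

```python
def percent(numerator, denominator):
    """Proportion expressed as a percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# (numerator, denominator) pairs taken from the Results text.
reported_counts = {
    "screen-detected cancers negative at N70": (107, 5977),
    "interval cancers negative at N70": (625, 1812),
    "screen-detected cancers positive at P3": (4871, 5977),
    "interval cancers positive at P3": (344, 1812),
    "N70 screen-detected cancers scored > 2 by both radiologists": (12, 107),
}

for label, (numerator, denominator) in reported_counts.items():
    print(f"{label}: {percent(numerator, denominator)}% ({numerator}/{denominator})")
```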
Conclusion: The AI system performed well in identifying both negative cases and cancer cases. Thus, it can be used to reduce the workload for radiologists and potentially increase the sensitivity of mammography.
Key points: Question: Results from large mammography screening samples not used in training AI algorithms are important to consider when planning prospective studies and implementation. Findings: More than 80% of the screen-detected cancers were classified as positive by AI when the 3% of examinations with the highest AI risk score were considered positive. Clinical relevance: A lack of radiologists is a challenge in mammographic screening. Our findings support other studies that suggest the use of AI to reduce screen-reading workload.
Keywords: Artificial intelligence; Breast cancer; Mammography; Screening.
© 2025. The Author(s).