External Evaluation of a Mammography-based Deep Learning Model for Predicting Breast Cancer in an Ethnically Diverse Population

Olasubomi J Omoleye; Anna E Woodard; Frederick M Howard; Fangyuan Zhao; Toshio F Yoshimatsu; Yonglan Zheng; Alexander T Pearson; Maksim Levental; Benjamin S Aribisala; Kirti Kulkarni; Gregory S Karczmar; Olufunmilayo I Olopade; Hiroyuki Abe; Dezheng Huo

doi:10.1148/ryai.220299


Radiol Artif Intell. 2023 Jul 26;5(6):e220299. doi: 10.1148/ryai.220299. eCollection 2023 Nov.

Affiliation

¹ From the Center for Clinical Cancer Genetics and Global Health, Department of Medicine (O.J.O., A.E.W., T.F.Y., Y.Z., B.S.A., O.I.O.), Data Science Institute (A.E.W.), Division of Hematology/Oncology, Department of Medicine (F.M.H., A.T.P.), Department of Public Health Sciences (F.Z., D.H.), Department of Computer Science (M.L.), and Department of Radiology (K.K., G.S.K., H.A.), The University of Chicago, 5841 S Maryland Ave, MC 2000, Chicago, IL 60637; Department of Computer Science, Lagos State University, Lagos, Nigeria (B.S.A.).

^# Contributed equally.

Abstract

Purpose: To externally evaluate a mammography-based deep learning (DL) model (Mirai) in a high-risk racially diverse population and compare its performance with other mammographic measures.

Materials and methods: A total of 6435 screening mammograms in 2096 female patients (median age, 56.4 years ± 11.2 [SD]) enrolled in a hospital-based case-control study from 2006 to 2020 were retrospectively evaluated. Pathologically confirmed breast cancer was the primary outcome. Mirai scores were the primary predictors. Breast density and Breast Imaging Reporting and Data System (BI-RADS) assessment categories were comparative predictors. Performance was evaluated using area under the receiver operating characteristic curve (AUC) and concordance index analyses.

Results: Mirai achieved 1- and 5-year AUCs of 0.71 (95% CI: 0.68, 0.74) and 0.65 (95% CI: 0.64, 0.67), respectively. One-year AUCs for nondense versus dense breasts were 0.72 versus 0.58 (P = .10). There was no evidence of a difference in near-term discrimination performance between BI-RADS and Mirai (1-year AUC, 0.73 vs 0.68; P = .34). For longer-term prediction (2-5 years), Mirai outperformed BI-RADS assessment (5-year AUC, 0.63 vs 0.54; P < .001). Using only images of the unaffected breast reduced the discriminatory performance of the DL model (P < .001 at all time points), suggesting that its predictions are likely dependent on the detection of ipsilateral premalignant patterns.

Conclusion: A mammography DL model showed good performance in a high-risk external dataset enriched for African American patients, benign breast disease, and BRCA mutation carriers, and study findings suggest that the model performance is likely driven by the detection of precancerous changes.

Supplemental material is available for this article. © RSNA, 2023. See also commentary by Kontos and Kalpathy-Cramer in this issue.

Keywords: Breast; Cancer; Computer Applications; Convolutional Neural Network; Deep Learning Algorithms; Epidemiology; Informatics; Machine Learning; Mammography; Oncology; Radiomics.
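
Note on the AUC metric: the abstract reports discrimination as area under the receiver operating characteristic curve (AUC). As a minimal illustration only, and not the authors' analysis code, the sketch below computes AUC in its rank-based form: the fraction of case-control pairs in which the case receives the higher score, with ties counted as one half. The function name `auc` and the toy scores are hypothetical and are not study data.

```python
def auc(scores, labels):
    """AUC as the share of (case, control) pairs where the case scores higher.

    scores: risk scores (higher means higher predicted risk).
    labels: 1 for a case (cancer), 0 for a control.
    Tied pairs contribute 0.5, matching the Mann-Whitney formulation.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up scores for illustration; not taken from the study.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)  # 3 of 4 case-control pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls. The time-dependent 1- and 5-year AUCs and the concordance index in the abstract additionally account for follow-up time, which this sketch does not model.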
