A deep learning method for classifying mammographic breast density categories

Med Phys. 2018 Jan;45(1):314-321. doi: 10.1002/mp.12683. Epub 2017 Dec 22.

Abstract

Purpose: Mammographic breast density is an established risk marker for breast cancer and is visually assessed by radiologists in routine mammogram image reading, using the four qualitative Breast Imaging Reporting and Data System (BI-RADS) breast density categories. It is particularly difficult for radiologists to consistently distinguish the two most common and most variably assigned BI-RADS categories, i.e., "scattered density" and "heterogeneously dense". The aim of this work was to investigate a deep learning-based breast density classifier that consistently distinguishes these two categories, with the goal of providing a computerized tool to assist radiologists in assigning a BI-RADS category in the current clinical workflow.

Methods: In this study, we constructed a convolutional neural network (CNN)-based model coupled with a large digital mammogram dataset (22,000 images) to evaluate classification performance between the two aforementioned breast density categories. All images were collected from a cohort of 1,427 women who underwent standard digital mammography screening from 2005 to 2016 at our institution. Ground truth density categories were based on standard clinical assessments made by board-certified breast imaging radiologists. For the specific task of breast density classification, we evaluated the effects of (a) training from scratch solely on digital mammogram images and (b) transfer learning from a model pretrained on a large nonmedical imaging dataset. To further assess classification performance, the CNN classifier was also tested on a refined version of the mammogram dataset from which potentially inaccurately labeled images had been removed. Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to measure classifier accuracy.

Results: The AUC was 0.9421 when the CNN model was trained from scratch on our own mammogram images, and accuracy increased gradually with the size of the training sample. Using the pretrained model followed by fine-tuning on as few as 500 mammogram images yielded an AUC of 0.9265. After removing the potentially inaccurately labeled images, the AUC increased to 0.9882 without and 0.9857 with the pretrained model, both significantly higher (P < 0.001) than when using the full imaging dataset.
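The AUC values reported above summarize how well the classifier's continuous scores separate the two density categories. As a point of reference, the AUC is equivalent to the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (the Mann-Whitney U formulation), which the following self-contained sketch computes; the function name and sample scores are illustrative, not from the paper.

```python
def auc_from_scores(scores, labels):
    """AUC as the probability that a random positive scores above a random
    negative (Mann-Whitney U formulation); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative scores: a perfectly separating classifier has AUC = 1.0.
print(auc_from_scores([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # -> 1.0
```

In practice, a library routine such as scikit-learn's `roc_auc_score` computes the same quantity from scores and labels.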

Conclusions: Our study demonstrated high classification accuracy between two difficult-to-distinguish breast density categories that are routinely assessed by radiologists. We anticipate that our approach will help enhance current clinical assessment of breast density and better support consistent density notification to patients in breast cancer screening.

Keywords: BI-RADS; breast density; convolutional neural network (CNN); deep learning; digital mammography; transfer learning.

MeSH terms

  • Area Under Curve
  • Female
  • Humans
  • Image Interpretation, Computer-Assisted / methods*
  • Machine Learning*
  • Mammography* / methods
  • Neural Networks, Computer*
  • ROC Curve
  • Retrospective Studies