Predicting Breast Cancer by Applying Deep Learning to Linked Health Records and Mammograms

Radiology. 2019 Aug;292(2):331-342. doi: 10.1148/radiol.2019182622. Epub 2019 Jun 18.


Background: Computational models based on deep neural networks are increasingly used to analyze health care data. However, the efficacy of traditional computational models in radiology is a matter of debate.

Purpose: To evaluate the accuracy and efficiency of a combined machine and deep learning approach for early breast cancer detection applied to a linked set of digital mammography images and electronic health records.

Materials and Methods: In this retrospective study, 52 936 images were collected from 13 234 women who underwent at least one mammogram between 2013 and 2017 and who had health records for at least 1 year before undergoing mammography. The algorithm was trained on 9611 mammograms and health records of women to make two breast cancer predictions: to predict biopsy malignancy and to differentiate normal from abnormal screening examinations. The study estimated the association of features with outcomes by using the t test and the Fisher exact test. Model comparisons were performed with a 95% confidence interval (CI) or by using the DeLong test.

Results: The resulting algorithm was validated in 1055 women and tested in 2548 women (mean age, 55 years ± 10 [standard deviation]). In the test set, the algorithm identified 34 of 71 (48%) false-negative findings on mammograms. For the malignancy prediction objective, the algorithm obtained an area under the receiver operating characteristic curve (AUC) of 0.91 (95% CI: 0.89, 0.93), with a specificity of 77.3% (95% CI: 69.2%, 85.4%) at a sensitivity of 87%. When trained on clinical data alone, the model performed significantly better than the Gail model (AUC, 0.78 vs 0.54, respectively; P < .004).

Conclusion: The algorithm, which combined machine-learning and deep-learning approaches, can be applied to assess breast cancer at a level comparable to radiologists and has the potential to substantially reduce missed diagnoses of breast cancer.

© RSNA, 2019
Online supplemental material is available for this article.
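The reported metrics (AUC, and specificity at a fixed sensitivity) can be illustrated with a minimal sketch. This is not the study's code: the function names `auc_rank` and `specificity_at_sensitivity` are hypothetical, and the scores and labels are made-up toy values used only to show how such figures are typically computed. The only number taken from the abstract is the 34 of 71 fraction.

```python
from typing import Sequence, Tuple


def auc_rank(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def specificity_at_sensitivity(
    scores: Sequence[float], labels: Sequence[int], target_sensitivity: float
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) for the highest score
    threshold whose sensitivity reaches the target.
    A case is called positive when its score is >= threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    for threshold in sorted(set(scores), reverse=True):
        sensitivity = sum(s >= threshold for s in pos) / len(pos)
        if sensitivity >= target_sensitivity:
            specificity = sum(s < threshold for s in neg) / len(neg)
            return threshold, sensitivity, specificity
    raise ValueError("target sensitivity cannot be reached")


# Toy data (assumed, NOT from the study): 1 = malignant, 0 = benign.
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
toy_labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

toy_auc = auc_rank(toy_scores, toy_labels)
toy_point = specificity_at_sensitivity(toy_scores, toy_labels, 0.80)

# Fraction of false-negative mammogram findings identified, from the abstract.
recovered_fraction = 34 / 71

print(f"toy AUC: {toy_auc:.2f}")
print(f"toy operating point (threshold, sens, spec): {toy_point}")
print(f"34 of 71 = {recovered_fraction:.1%}")
```

Fixing the sensitivity first and then reading off the specificity, as the abstract does at 87%, makes operating points comparable across models; the confidence intervals and the DeLong comparison reported in the study are not reproduced here.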

MeSH terms

  • Breast / diagnostic imaging
  • Breast Neoplasms / diagnostic imaging*
  • Deep Learning*
  • Electronic Health Records*
  • Female
  • Humans
  • Mammography / methods*
  • Middle Aged
  • Predictive Value of Tests
  • Radiographic Image Interpretation, Computer-Assisted / methods*
  • Reproducibility of Results
  • Retrospective Studies
  • Sensitivity and Specificity