Differential Data Augmentation Techniques for Medical Imaging Classification Tasks

AMIA Annu Symp Proc. 2018 Apr 16:2017:979-984. eCollection 2017.


Data augmentation is an essential part of training discriminative Convolutional Neural Networks (CNNs). A variety of augmentation strategies, including horizontal flips, random crops, and principal component analysis (PCA), have been proposed and shown to capture important characteristics of natural images. However, while data augmentation is commonly used for deep learning in medical imaging, little work has been done to determine which augmentation strategies best capture medical image statistics and therefore yield more discriminative models. This work compares augmentation strategies and shows that the extent to which an augmented training set retains properties of the original medical images determines model performance. Specifically, augmentation strategies such as flips and Gaussian filters lead to validation accuracies of 84% and 88%, respectively. On the other hand, a less effective strategy such as adding noise leads to a significantly worse validation accuracy of 66%. Finally, we show that the augmentation strategy affects mass generation.
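As an illustration of three of the strategies compared above (flips, Gaussian filters, and added noise), the following is a minimal NumPy sketch and not the authors' implementation. It assumes a 2D grayscale image scaled to [0, 1]. The function names, the blur width `sigma`, and the noise level `std` are hypothetical choices made for this example; the abstract does not report the settings used in the study.

```python
import numpy as np


def horizontal_flip(image):
    """Mirror the image left to right."""
    return image[:, ::-1]


def gaussian_blur(image, sigma=1.0):
    """Smooth the image with a separable Gaussian kernel (sigma is an assumed value)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()  # normalize so overall brightness is preserved

    # Reflect-pad so the output keeps the input shape.
    padded = np.pad(image, radius, mode="reflect")
    # Convolve along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def add_gaussian_noise(image, std=0.1, rng=None):
    """Add zero-mean Gaussian noise and clip to [0, 1] (std is an assumed value)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image + rng.normal(0.0, std, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))  # stand-in for a normalized image patch
    augmented = [
        horizontal_flip(image),
        gaussian_blur(image, sigma=1.0),
        add_gaussian_noise(image, std=0.1, rng=rng),
    ]
    print([a.shape for a in augmented])
```

Flipping and mild blurring leave the large-scale structure of the image largely intact, whereas additive noise perturbs every pixel independently. That is one plausible reading of why the abstract reports lower validation accuracy for noise, though the sketch itself does not test this.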

Publication types

  • Comparative Study

MeSH terms

  • Data Visualization
  • Datasets as Topic
  • Deep Learning*
  • Diagnostic Imaging
  • Humans
  • Image Enhancement / methods*
  • Mammography / classification*
  • Neural Networks, Computer*
  • Radiology Information Systems