ADR-Net: Context extraction network based on M-Net for medical image segmentation

Med Phys. 2020 Sep;47(9):4254-4264. doi: 10.1002/mp.14364. Epub 2020 Aug 2.

Abstract

Purpose: Medical image segmentation is an essential component of medical image analysis. Accurate segmentation can assist doctors in diagnosis and relieve their fatigue. Although several U-Net-based segmentation methods have been proposed, their performance tends to be suboptimal on small objects. To address this shortcoming, this study proposes a novel network architecture that improves segmentation performance on small medical targets.

Methods: We propose a joint multi-scale context attention network architecture that simultaneously captures higher-level semantic information and spatial information. To obtain a greater number of feature maps during decoding, the network down-samples the input image and concatenates the resulting side inputs during the encoding phase. In the bottleneck layer, dense atrous convolution (DAC) and multi-scale residual pyramid pooling (RMP) modules are used to better capture high-level semantic information and spatial information. To improve segmentation performance on small targets, an attention gate (AG) block suppresses feature activations in irrelevant regions and highlights the target area.
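The gating idea behind the AG block can be illustrated with a minimal sketch. It assumes the additive attention-gate formulation commonly paired with U-Net-style networks; the abstract does not specify the internals of the block, so the function name `attention_gate`, the weights `w_x`, `w_g`, `psi`, and the toy inputs are all hypothetical. Each pixel of the skip features `x` is scaled by a coefficient in (0, 1) computed from `x` and a gating signal `g`.

```python
import math


def attention_gate(x, g, w_x, w_g, psi, bias=0.0):
    """Additive attention gate over a flattened feature map (illustrative only).

    x, g     : lists of per-pixel feature vectors (skip features and gating
               signal), assumed already resampled to the same spatial size.
    w_x, w_g : weight matrices (lists of rows) projecting x and g into a
               shared intermediate space.
    psi      : weight vector mapping the intermediate space to a scalar score.
    Returns (gated_features, alphas).
    """
    def matvec(w, v):
        return [sum(a * b for a, b in zip(row, v)) for row in w]

    gated, alphas = [], []
    for x_i, g_i in zip(x, g):
        # ReLU of the summed projections of skip features and gating signal
        hidden = [max(0.0, a + b)
                  for a, b in zip(matvec(w_x, x_i), matvec(w_g, g_i))]
        # Scalar attention score, squashed to (0, 1) by a sigmoid
        score = sum(p * h for p, h in zip(psi, hidden)) + bias
        alpha = 1.0 / (1.0 + math.exp(-score))
        alphas.append(alpha)
        # Scale the skip features: low alpha suppresses the pixel
        gated.append([alpha * v for v in x_i])
    return gated, alphas


# Toy example: two pixels with identical skip features but opposite gating signals
x = [[1.0], [1.0]]
g = [[3.0], [-3.0]]
gated, alphas = attention_gate(x, g, w_x=[[1.0]], w_g=[[1.0]], psi=[4.0], bias=-4.0)
```

In this toy setting the first pixel keeps almost all of its activation, while the second is scaled close to zero, which is the suppress-and-highlight behaviour described above. In a trained network the weights would be learned rather than set by hand.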

Results: The proposed model is first evaluated on the public DRIVE dataset, on which it performs significantly better than the basic framework in terms of sensitivity (SE), intersection-over-union (IOU), and area under the receiver operating characteristic curve (AUC). In particular, SE and IOU increase by 7.46% and 5.97%, respectively. The evaluation indices also improve relative to those of state-of-the-art methods, with SE and IOU increasing by 3.58% and 3.26%, respectively. To demonstrate the generalizability of the proposed architecture, we evaluate the model on three other challenging datasets, on which it outperforms state-of-the-art network architectures. Moreover, a comparative lung segmentation experiment is used to demonstrate that the advantages of the proposed approach on small targets transfer to the segmentation of large targets. Finally, an ablation study investigates the individual contributions of the AG, DAC, and RMP blocks to the performance of the network.
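For reference, SE and IOU on binary masks can be computed as in the sketch below, using the standard definitions SE = TP / (TP + FN) and IOU = TP / (TP + FP + FN). The function names and the toy masks are illustrative and are not taken from the study; returning 0.0 when a denominator is empty is an assumed convention.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives
    for two flattened binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp, fp, fn


def sensitivity(pred, truth):
    """SE = TP / (TP + FN): fraction of target pixels that were found."""
    tp, _, fn = confusion_counts(pred, truth)
    # Assumed convention: 0.0 when the ground truth has no target pixels
    return tp / (tp + fn) if (tp + fn) else 0.0


def iou(pred, truth):
    """IOU = TP / (TP + FP + FN): overlap divided by union of the two masks."""
    tp, fp, fn = confusion_counts(pred, truth)
    # Assumed convention: 0.0 when both masks are empty
    return tp / (tp + fp + fn) if (tp + fp + fn) else 0.0


# Toy masks: 1 marks a target (e.g. vessel) pixel, 0 marks background
pred = [1, 1, 0, 0, 1]
truth = [1, 0, 0, 1, 1]
se_value = sensitivity(pred, truth)
iou_value = iou(pred, truth)
```

Because small targets occupy few pixels, each missed pixel lowers SE and IOU noticeably, which is why these two indices are emphasized for small-target segmentation.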

Conclusions: The proposed method is evaluated on various datasets. Experimental results demonstrate that the proposed model performs better than state-of-the-art methods in medical image segmentation of small targets.

Keywords: attention gate mechanism; deep learning; dilation convolution; medical image segmentation; residual; spatial pyramid pooling.

MeSH terms

  • Image Processing, Computer-Assisted*
  • Lung
  • Neural Networks, Computer*