EPT-Net: Edge Perception Transformer for 3D Medical Image Segmentation

IEEE Trans Med Imaging. 2023 Nov;42(11):3229-3243. doi: 10.1109/TMI.2023.3278461. Epub 2023 Oct 27.

Abstract

Convolutional neural networks have achieved remarkable results in most medical image segmentation applications. However, the intrinsic locality of the convolution operation limits its ability to model long-range dependencies. Although the Transformer, designed for sequence-to-sequence global prediction, was introduced to solve this problem, it may have limited positioning capability because it lacks sufficient low-level detail features. Moreover, low-level features carry rich fine-grained information, which greatly affects edge segmentation decisions for different organs. A simple CNN module, however, struggles to capture the edge information in fine-grained features, and processing high-resolution 3D features is costly in both computation and memory. This paper proposes EPT-Net, an encoder-decoder network that combines edge perception with a Transformer structure to segment medical images accurately. Within this framework, we propose a Dual Position Transformer to enhance 3D spatial positioning ability. In addition, because low-level features contain detailed information, we design an Edge Weight Guidance module that extracts edge information by minimizing an edge information function without adding network parameters. We verified the effectiveness of the proposed method on three datasets: SegTHOR 2019, Multi-Atlas Labeling Beyond the Cranial Vault, and KiTS19-M, our re-labeled version of the KiTS19 dataset. The experimental results show that EPT-Net significantly improves on state-of-the-art medical image segmentation methods.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Image Processing, Computer-Assisted
  • Neural Networks, Computer*
  • Perception
  • Skull*