Cross-convolutional transformer for automated multi-organs segmentation in a variety of medical images

Jing Wang; Haiyue Zhao; Wei Liang; Shuyu Wang; Yan Zhang

doi:10.1088/1361-6560/acb19a

Phys Med Biol. 2023 Jan 23;68(3). doi: 10.1088/1361-6560/acb19a.

Authors

Jing Wang¹, Haiyue Zhao², Wei Liang³, Shuyu Wang², Yan Zhang⁴

Affiliations

¹ School of Information Science and Engineering Department, Shandong University, 72 Binghai Road, Jimo, Qingdao, Shandong, People's Republic of China.
² Shandong Youth University of Political Science, No.31699 Jing Shi East Road, Li Xia District, Jinan, Shandong, People's Republic of China.
³ Department of ecological environment of Shandong, No.3377 Jing Shi Road, Jinan, People's Republic of China.
⁴ Shandong Mental Health Center, No.49 Wen Hua Dong Road, Li Xia District, Jinan, Shandong, People's Republic of China.

PMID: 36623323
DOI: 10.1088/1361-6560/acb19a

Abstract

Objective. Segmenting multiple organs across different kinds of medical images with a single, consistent algorithm remains a major challenge for deep learning methods. We therefore develop a deep learning method based on a cross-convolutional transformer for automated segmentation, aiming for better generalization and accuracy.

Approach. We propose a cross-convolutional transformer network (C²Former) to solve the segmentation problem. First, we design a cross-convolutional self-attention mechanism that integrates local and global contexts and models both long-distance and short-distance dependencies, enhancing the semantic understanding of image features. Second, we propose a multi-scale feature edge fusion module that combines image edge features, forming multi-scale feature streams and establishing reliable relational connections in the global context. Finally, we train and test multi-organ segmentation on three different imaging modalities covering three different anatomical regions, and evaluate segmentation performance.

Main results. We evaluate each dataset with the Dice similarity coefficient (DSC) and the 95% Hausdorff distance (HD95). The experiments showed an average DSC of 83.22% and HD95 of 17.55 mm on the Synapse dataset (abdominal multi-organ CT images), an average DSC of 91.42% and HD95 of 1.06 mm on the ACDC dataset (MRI of cardiac substructures), and an average DSC of 86.78% and HD95 of 16.85 mm on the ISIC 2017 dataset (skin cancer images). On each dataset, the proposed method consistently outperforms the compared networks.

Significance. The proposed deep learning network provides a generalized and accurate solution for multi-organ segmentation on the three different datasets. It has the potential to be applied to a variety of medical datasets for structural segmentation.
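The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes each mask is given as a set of pixel coordinates, that HD95 is taken as the larger of the two directed 95th-percentile nearest-neighbour distances (other definitions exist), and that percentiles use linear interpolation. The function names and the `spacing` parameter are hypothetical, and extracting boundary pixels from a mask is left out.

```python
import math


def dice_coefficient(pred, truth):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def _percentile(values, q):
    """Percentile q (0..100) of a non-empty list, with linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def _directed_distances(src, dst, spacing):
    """Distance from each point in src to its nearest point in dst."""
    def scale(point):
        # Convert pixel indices to physical units (e.g. mm).
        return [c * s for c, s in zip(point, spacing)]

    scaled_dst = [scale(q) for q in dst]
    return [min(math.dist(scale(p), q) for q in scaled_dst) for p in src]


def hd95(pred, truth, spacing=(1.0, 1.0)):
    """95% Hausdorff distance between two non-empty point sets.

    Assumption: max of the two directed 95th percentiles. Brute force,
    O(|pred| * |truth|), so only suitable for small illustrative inputs.
    """
    pred, truth = list(pred), list(truth)
    if not pred or not truth:
        raise ValueError("hd95 is undefined for an empty point set")
    forward = _percentile(_directed_distances(pred, truth, spacing), 95)
    backward = _percentile(_directed_distances(truth, pred, spacing), 95)
    return max(forward, backward)
```

DSC measures region overlap (higher is better, reported above as a percentage), while HD95 measures boundary disagreement in physical units (lower is better), which is why the abstract reports both for each dataset.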

Keywords: deep learning; medical image; self-attention; transformer; visual attention mechanism.

Creative Commons Attribution license.

MeSH terms

Algorithms
Humans
Image Processing, Computer-Assisted / methods
Magnetic Resonance Imaging
Neural Networks, Computer*
Skin Neoplasms*