Shift-channel attention and weighted-region loss function for liver and dense tumor segmentation

Med Phys. 2022 Nov;49(11):7193-7206. doi: 10.1002/mp.15816. Epub 2022 Jul 6.

Abstract

Purpose: A robust, automatic liver and tumor segmentation method is in high demand in clinical practice to assist physicians with tumor diagnosis and treatment planning. Many recent methods improve liver and tumor segmentation accuracy by introducing multiscale contextual information and attention mechanisms. However, these additions tend to increase the number of trainable parameters and the computational burden. In addition, tumors vary in size, shape, location, and number, which is the main reason for the poor accuracy of automatic segmentation. Although current loss functions improve the model's ability to learn from hard samples to some extent, they struggle to optimize the segmentation of small tumor regions when large tumor regions make up the majority of a sample.
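The imbalance between large and small tumor regions can be illustrated with a small worked calculation. The sketch below uses made-up voxel counts (1000 and 10, chosen only for illustration) and a hypothetical helper, dice_from_counts; it is not taken from the paper. It shows that a Dice score pooled over all tumor voxels stays near 1 even when a small tumor is missed entirely.

```python
def dice_from_counts(tp, fp, fn):
    """Dice coefficient from true-positive, false-positive, false-negative voxel counts."""
    return 2 * tp / (2 * tp + fp + fn)

# Illustrative voxel counts (assumed, not from the paper).
large_tumor_voxels = 1000   # segmented perfectly
small_tumor_voxels = 10     # missed completely

# Dice pooled over both regions: the missed small tumor only adds false negatives.
pooled = dice_from_counts(tp=large_tumor_voxels, fp=0, fn=small_tumor_voxels)

# Dice computed separately for each region exposes the failure.
per_region = [
    dice_from_counts(tp=large_tumor_voxels, fp=0, fn=0),
    dice_from_counts(tp=0, fp=0, fn=small_tumor_voxels),
]

print(round(pooled, 4), per_region)
```

Because the pooled score barely changes, a loss built on it gives little gradient signal for the small region, which is the situation the abstract describes.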

Methods: We propose a Liver and Tumor Segmentation Network (LiTS-Net) framework. First, the Shift-Channel Attention Module (S-CAM) is designed to model feature interdependencies between adjacent channels without requiring additional training parameters. Second, the Weighted-Region (WR) loss function is proposed to increase the weight of small tumors in dense tumor regions and reduce the weight of easily segmented samples. Third, Multiple 3D Inception Encoder Units (MEU) are adopted to capture multiscale contextual information for better segmentation of liver and tumor.
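To make the idea of parameter-free attention over adjacent channels concrete, here is a minimal sketch. The abstract does not give the S-CAM formulation, so everything here is an assumption: the function name shift_channel_gate, the use of global average pooling, the circular shift by one channel, the sigmoid gate, and the default proportion of 0.25 are illustrative choices, not the authors' design.

```python
import numpy as np

def shift_channel_gate(x, proportion=0.25):
    """Gate a (C, D, H, W) feature map using its neighbouring channel.

    Hypothetical illustration only; contains no trainable parameters.
    """
    # One descriptor per channel via global average pooling.
    desc = x.mean(axis=(1, 2, 3))                      # shape (C,)
    # Descriptor of the adjacent channel (circular shift by one).
    neighbour = np.roll(desc, 1)
    # Blend each channel with its neighbour by the shift proportion.
    mixed = (1.0 - proportion) * desc + proportion * neighbour
    # Sigmoid turns the blended descriptor into a per-channel weight in (0, 1).
    gate = 1.0 / (1.0 + np.exp(-mixed))
    return x * gate[:, None, None, None]

features = np.ones((4, 2, 2, 2))
gated = shift_channel_gate(features)
print(gated.shape)
```

The point of the sketch is only that channel interaction can come from a fixed shift-and-blend operation rather than from learned weights, so the module adds no parameters to train.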

Results: The efficacy of LiTS-Net is demonstrated on the public dataset of the MICCAI 2017 Liver Tumor Segmentation (LiTS) challenge, with Dice per case of 96.9% for liver and 75.1% for tumor segmentation. On the 3D Image Reconstruction for Comparison of Algorithm and DataBase (3Dircadb) dataset, the Dice scores are 96.47% for liver and 74.54% for tumor segmentation. The proposed LiTS-Net outperforms existing state-of-the-art networks.
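For readers unfamiliar with the metric, Dice per case is commonly computed as the Dice coefficient of each scan averaged over all scans. The sketch below follows that common definition; the function names and the convention of returning 1.0 when both masks are empty are assumptions, not details from the paper or the challenge's evaluation code.

```python
import numpy as np

def dice(pred, truth):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)

def dice_per_case(cases):
    """Mean of per-scan Dice scores; cases is a list of (pred, truth) pairs."""
    return float(np.mean([dice(p, t) for p, t in cases]))

# Toy masks: one perfect case, one with no overlap.
toy_cases = [([1, 1, 0], [1, 1, 0]), ([1, 0, 0], [0, 1, 0])]
score = dice_per_case(toy_cases)
print(score)
```

Averaging per scan gives every patient equal weight regardless of tumor volume, which differs from pooling all voxels across the dataset before computing a single score.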

Conclusions: We demonstrated the effectiveness of LiTS-Net and its core components for liver and tumor segmentation. The S-CAM models feature interdependencies between adjacent channels without adding training parameters. We also studied the proportion of features shifted between adjacent channels in depth to determine the optimal shift proportion. In addition, the WR loss function implicitly learns the weights among regions, so they do not need to be specified manually. In dense tumor segmentation tasks, WR increases the weights of small tumor regions and alleviates the problem that small-tumor segmentation is hard to optimize further when large tumor regions occupy the majority. Finally, the proposed method outperforms other state-of-the-art methods on both the LiTS and 3Dircadb datasets.

Keywords: Weighted-Region loss function; class imbalance learning; multiscale contextual information; shift-channel attention.

MeSH terms

  • Humans
  • Liver* / diagnostic imaging
  • Neoplasms*