Purpose: Advances in convolutional neural networks (CNNs) have made deep neural networks a mainstream approach in clinical diagnostic support, and CNN-based medical image segmentation in particular has delivered favorable results. U-shaped architectures with skip connections, along with their many derivatives, are now widely used across medical image segmentation tasks. Nonetheless, two primary challenges persist. First, some organs and tissues are structurally complex and vary considerably in shape and size, which makes highly accurate segmentation difficult. Second, most current deep neural networks extract features at a single resolution, which limits how much feature information they can recover from complex medical images: repeated pooling operations lose information, and the U-shaped structure constrains the interaction of contextual information.
Approach: We propose a five-layer pyramid segmentation network (PS5-Net), a multiscale, multiresolution segmentation network built on the U-Net architecture. First, the network exploits the distinct features that images exhibit at different resolutions across different dimensions, departing from prior single-resolution feature extraction methods so that it can adapt to complex and variable segmentation scenarios. Second, to integrate feature information from the different resolutions, we propose a kernel selection module that assigns weights to features across different dimensions, improving their fusion. The feature extraction network, denoted PS-UNet, preserves the classical structure of the traditional U-Net and enhances it with dilated convolutions.
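The weighting idea behind the kernel selection module can be illustrated with a minimal sketch. This is not the authors' module: it assumes each resolution branch has already been resampled to a common length, and it uses each branch's global average directly as its score, where a real module would pass that descriptor through learned layers. The names `softmax` and `fuse_branches` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(branches):
    """Fuse per-resolution feature vectors with softmax weights.

    branches: list of equal-length lists of floats, one per resolution.
    Returns (fused_features, weights).
    """
    length = len(branches[0])
    if any(len(b) != length for b in branches):
        raise ValueError("all branches must have the same length")
    # Assumption: the global average stands in for a learned branch score.
    descriptors = [sum(b) / length for b in branches]
    weights = softmax(descriptors)
    # Weighted sum across branches at every feature position.
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(length)
    ]
    return fused, weights
```

The weights always sum to 1, so the fused output stays on the same scale as the inputs, and a branch with a larger score contributes more to the result.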
Results: PS5-Net attains a Dice score of 0.9613 for liver segmentation on the CHLISC dataset and 0.8587 for skin lesion segmentation on the ISIC2018 dataset. In comparisons with a range of recent medical image segmentation methods, PS5-Net achieved the highest scores, with substantial improvements over the compared methods.
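For reference, the Dice score between a predicted mask and a ground-truth mask is twice the size of their overlap divided by the sum of their sizes. The sketch below assumes binary masks flattened to sequences of 0 and 1. The function name `dice_score` and the choice to return 1.0 when both masks are empty are illustrative conventions, not details taken from the evaluation reported here.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks (flat sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels labeled foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground pixels.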
Conclusions: PS5-Net exploits the rich semantic information available at different resolutions to build a fuller understanding of the input medical images. By using global contextual connections, the network captures the interplay of features and dependencies across the entire image, resulting in more accurate and robust segmentation. The experimental validation indicates superior performance in medical image segmentation tasks, which is promising for diagnostic and analytical processes in clinical settings. These results suggest that PS5-Net could contribute to the advancement of medical imaging technologies and, ultimately, to patient care through more precise and reliable image analysis.
Keywords: artificial intelligence; convolutional neural network; deep learning; multiscale feature fusion; semantic segmentation.
© 2024 Society of Photo-Optical Instrumentation Engineers (SPIE).