Learning Deep Representations of Cardiac Structures for 4D Cine MRI Image Segmentation through Semi-Supervised Learning

S M Kamrul Hasan; Cristian A Linte

doi:10.3390/app122312163

Appl Sci (Basel). 2022 Dec 1;12(23):12163. doi: 10.3390/app122312163. Epub 2022 Nov 28.

Authors

S M Kamrul Hasan¹, Cristian A Linte¹ ²

Affiliations

¹ Center for Imaging Science, Rochester Institute of Technology, Rochester, NY 14623, USA.
² Department of Biomedical Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA.

Abstract

Learning good data representations for medical imaging tasks preserves relevant information and removes irrelevant information from the data, which improves the interpretability of the learned features. In this paper, we propose a semi-supervised model, combine-all in semi-supervised learning (CqSL), to demonstrate the power of a simple combination of a disentanglement block, a variational autoencoder (VAE), a generative adversarial network (GAN), and a conditioning layer-based reconstructor for performing two important tasks in medical imaging: segmentation and reconstruction. Our work is motivated by recent progress in image segmentation using semi-supervised learning (SSL), which has shown good results with limited labeled data and large amounts of unlabeled data. The disentanglement block decomposes an input image into a domain-invariant spatial factor and a domain-specific non-spatial factor. We assume that medical images acquired using multiple scanners (different domain information) share a common spatial space but differ in non-spatial space (intensities, contrast, etc.). Hence, we use the spatial information to generate segmentation masks from unlabeled datasets using the GAN. Finally, to reconstruct the original image, our conditioning layer-based reconstruction block recombines the spatial information with random non-spatial information sampled from the generative models. Our ablation study demonstrates the benefits of disentanglement in holding domain-invariant (spatial) as well as domain-specific (non-spatial) information with high accuracy. We further apply a structured $L_{2}$ similarity (SL$_{2}$SIM) loss along with a mutual information minimizer (MIM) to improve the adversarially trained generative models for better reconstruction.
Experimental results on the STACOM 2017 ACDC cine cardiac magnetic resonance (MR) dataset suggest that our proposed CqSL model outperforms fully supervised and semi-supervised models, achieving 83.2% accuracy even when using only 1% labeled data. We hypothesize that our proposed model has the potential to become an efficient semantic segmentation tool that may be used for domain adaptation in data-limited medical imaging scenarios, where annotations are expensive. Code and experimental configurations will be made publicly available.
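
The recombination step described in the abstract (a spatial factor modulated by a sampled non-spatial factor) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the non-spatial factor is drawn with the standard VAE reparameterization trick and that the conditioning layer applies a per-channel scale and shift (FiLM-style), which the abstract does not specify. All names (`sample_nonspatial`, `condition`, `w_gamma`, `w_beta`), the tensor sizes, and the channel-sum "decoder" are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_nonspatial(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def condition(spatial, z, w_gamma, w_beta):
    """Assumed FiLM-style conditioning: per-channel scale and shift derived from z."""
    gamma = z @ w_gamma  # shape (C,): one scale per spatial channel
    beta = z @ w_beta    # shape (C,): one shift per spatial channel
    return spatial * gamma[:, None, None] + beta[:, None, None]

# Hypothetical sizes: C spatial channels, an H x W image grid, Z latent dimensions.
C, H, W, Z = 4, 8, 8, 3
spatial = rng.random((C, H, W))            # stand-in for the domain-invariant spatial factor
mu, log_var = np.zeros(Z), np.zeros(Z)     # stand-in for the encoder's posterior parameters
w_gamma = rng.standard_normal((Z, C))      # stand-in for learned conditioning weights
w_beta = rng.standard_normal((Z, C))

z = sample_nonspatial(mu, log_var, rng)    # domain-specific non-spatial factor
features = condition(spatial, z, w_gamma, w_beta)
recon = features.sum(axis=0)               # placeholder for a learned decoder
print(features.shape, recon.shape)
```

In this sketch the anatomy-like content lives entirely in `spatial`, while `z` only rescales and offsets channels. That mirrors the stated assumption that scanners share a common spatial space but differ in intensity and contrast.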

Keywords: augmentation; cardiac segmentation; disentangled representation; domain invariant features; generative adversarial network; image quality; mutual information; reconstruction; variational autoencoder.

Grants and funding

R35 GM128877/GM/NIGMS NIH HHS/United States