Memory-efficient 2.5D convolutional transformer networks for multi-modal deformable registration with weak label supervision applied to whole-heart CT and MRI scans

Int J Comput Assist Radiol Surg. 2019 Nov;14(11):1901-1912. doi: 10.1007/s11548-019-02068-z. Epub 2019 Sep 19.


PURPOSE: Despite their potential for improvement through supervision, deep learning-based registration approaches are difficult to train for large deformations in 3D scans because of their excessive memory requirements. METHODS: We propose a new 2.5D convolutional transformer architecture that enables us to learn a memory-efficient, weakly supervised deep learning model for multi-modal image registration. Furthermore, we integrate, for the first time, a volume change control term into the loss function of a deep learning-based registration method in order to penalize foldings in the deformation field. RESULTS: Our approach succeeds at learning large deformations across multi-modal images. We evaluate it on 100 pair-wise registrations of CT and MRI whole-heart scans and demonstrate a considerably higher Dice score (0.74) than a state-of-the-art unsupervised discrete registration framework (deeds, with a Dice score of 0.71). CONCLUSION: Our proposed memory-efficient registration method performs better than state-of-the-art conventional registration methods. Using a volume change control term in the loss function considerably reduces the number of foldings on new registration cases.
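
The two quantities the abstract relies on, label overlap (Dice) and foldings in the deformation field, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the (3, D, H, W) displacement layout in voxel units, and the hinge-style penalty on negative Jacobian determinants are all assumptions made for illustration, since the abstract does not give the exact form of the volume change control term.

```python
import numpy as np


def dice_score(a, b):
    """Dice overlap of two binary masks: 2|A n B| / (|A| + |B|)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * np.logical_and(a, b).sum() / denom


def jacobian_determinant(disp):
    """Determinant of the Jacobian of phi(x) = x + u(x) at every voxel.

    disp: displacement field u with shape (3, D, H, W), in voxel units
    (assumed layout). Returns an array of shape (D, H, W).
    """
    disp = np.asarray(disp, dtype=float)
    # grads[i][j] approximates d u_i / d x_j with finite differences.
    grads = [np.gradient(disp[i], axis=(0, 1, 2)) for i in range(3)]
    jac = np.empty(disp.shape[1:] + (3, 3))
    for i in range(3):
        for j in range(3):
            jac[..., i, j] = grads[i][j] + (1.0 if i == j else 0.0)
    return np.linalg.det(jac)


def folding_fraction(disp):
    """Fraction of voxels where the transformation folds (det <= 0)."""
    return float(np.mean(jacobian_determinant(disp) <= 0))


def folding_penalty(disp):
    """Hypothetical penalty term: mean magnitude of negative determinants."""
    det = jacobian_determinant(disp)
    return float(np.mean(np.maximum(0.0, -det)))
```

An identity transformation (zero displacement) has a determinant of 1 everywhere, so both the folding fraction and the penalty are 0; a field that mirrors one axis has a determinant of -1 everywhere and is counted as folded at every voxel. In a training setting the penalty would be computed with a differentiable framework rather than NumPy.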

Keywords: 2.5D; CT; Convolutional neural networks; MRI; Multi-modal registration; Weakly supervised learning.

MeSH terms

  • Deep Learning*
  • Equipment Design
  • Heart / diagnostic imaging*
  • Humans
  • Magnetic Resonance Imaging / instrumentation*
  • Neural Networks, Computer*
  • Phantoms, Imaging*
  • Tomography, X-Ray Computed / instrumentation*