Domain generalization improves end-to-end object detection for real-time surgical tool detection

Int J Comput Assist Radiol Surg. 2023 May;18(5):939-944. doi: 10.1007/s11548-022-02823-9. Epub 2022 Dec 29.

Abstract

Purpose: Computer assistance for endoscopic surgery depends on knowledge of the contents of the endoscopic scene. An important step in analysing the video contents is real-time surgical tool detection. Most tool detection methods, however, rely on multi-step algorithms built on prior knowledge such as anchor boxes or non-maximum suppression, which ultimately decreases performance. A real-world difficulty for learning-based methods is that datasets are limited. Training a neural network on data matching a specific distribution (e.g. from a single hospital or showing a specific type of surgery) can result in a lack of generalization.

Methods: We propose applying a transformer-based architecture to end-to-end tool detection. This architecture promises state-of-the-art accuracy while reducing complexity, which improves run-time performance. To address the lack of cross-domain generalization caused by limited datasets, we enhance the architecture with a latent feature space, obtained via variational encoding, to capture common intra-domain information. This feature space models the linear dependencies between domains by constraining their rank.
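
As a rough illustration of the two ingredients named in the Methods, the sketch below shows a generic variational encoding step (reparameterized sampling plus the KL term towards a standard normal) and a rank computation over per-domain feature vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names reparameterize, kl_to_standard_normal and numerical_rank, the diagonal Gaussian posterior, and the toy domain_features matrix are all hypothetical, and the abstract does not state how the rank constraint is actually imposed.

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise.

    Assumes a diagonal Gaussian posterior (hypothetical choice).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


def numerical_rank(rows, tol=1e-9):
    """Rank of a small matrix via Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in r] for r in rows]
    n_rows = len(a)
    n_cols = len(a[0]) if a else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, n_rows):
            factor = a[r][col] / a[rank][col]
            for c in range(col, n_cols):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank


# Toy example (made-up numbers): one latent feature vector per domain.
# The third domain is the sum of the first two, so the rows are
# linearly dependent and the matrix has rank 2 rather than 3.
domain_features = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 3.0],
]
z = reparameterize([0.0, 1.0], [0.0, 0.0], random.Random(0))
kl = kl_to_standard_normal([1.0], [0.0])
rank = numerical_rank(domain_features)
```

In a trained model, a differentiable surrogate for the rank would be needed in the loss; the exact elimination above only illustrates what a low-rank relationship between domains means.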

Results: The trained neural networks show a distinct improvement on out-of-domain data, indicating better generalization to unseen domains. Inference with the end-to-end architecture runs at up to 138 frames per second (FPS), a speedup over older approaches.
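
To put the reported throughput in context, 138 FPS corresponds to a per-frame budget of roughly 7.2 ms. The sketch below shows that arithmetic and one common way to time a detector. It is only an assumed measurement setup: measure_fps, frame_budget_ms, the warm-up count, and the stand-in infer callable are hypothetical, and the abstract does not describe the authors' benchmarking protocol or hardware.

```python
import time


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def measure_fps(infer, frames, warmup=2):
    """Average frames per second of `infer` over `frames`.

    A few warm-up calls are run first and excluded from the timing.
    """
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


# Stand-in workload (hypothetical): a real test would call the detector.
dummy_frames = [[1, 2, 3]] * 50
fps = measure_fps(lambda frame: sum(frame), dummy_frames)
budget = frame_budget_ms(138)  # about 7.25 ms per frame
```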

Conclusions: Experimental results on three representative datasets demonstrate the performance of the method. We also show that our approach leads to better domain generalization.

Keywords: Computer-assisted interventions; Deep learning; Domain generalization; Surgical tool detection.

MeSH terms

  • Algorithms*
  • Endoscopy
  • Humans
  • Neural Networks, Computer*