Development and Validation of a Model for Laparoscopic Colorectal Surgical Instrument Recognition Using Convolutional Neural Network-Based Instance Segmentation and Videos of Laparoscopic Procedures

Daichi Kitaguchi; Younae Lee; Kazuyuki Hayashi; Kei Nakajima; Shigehiro Kojima; Hiro Hasegawa; Nobuyoshi Takeshita; Kensaku Mori; Masaaki Ito

doi:10.1001/jamanetworkopen.2022.26265

JAMA Netw Open. 2022 Aug 1;5(8):e2226265. doi: 10.1001/jamanetworkopen.2022.26265.

Authors

Daichi Kitaguchi¹,², Younae Lee¹, Kazuyuki Hayashi¹, Kei Nakajima¹,², Shigehiro Kojima¹,², Hiro Hasegawa¹,², Nobuyoshi Takeshita¹,², Kensaku Mori³, Masaaki Ito¹,²

Affiliations

¹ Surgical Device Innovation Office, National Cancer Center Hospital East, Kashiwanoha, Kashiwa, Chiba, Japan.
² Department of Colorectal Surgery, National Cancer Center Hospital East, Kashiwanoha, Kashiwa, Chiba, Japan.
³ Graduate School of Informatics, Nagoya University, Nagoya, Aichi, Japan.

Abstract

Importance: Deep learning-based automatic surgical instrument recognition is an indispensable technology for surgical research and development. However, pixel-level recognition with high accuracy is required to make it suitable for surgical automation.

Objective: To develop a deep learning model that can simultaneously recognize 8 types of surgical instruments frequently used in laparoscopic colorectal operations and evaluate its recognition performance.

Design, setting, and participants: This quality improvement study was conducted at a single institution with a multi-institutional data set. Laparoscopic colorectal surgical videos recorded between April 1, 2009, and December 31, 2021, were included in the video data set. Deep learning-based instance segmentation, an image recognition approach that recognizes each object individually and pixel by pixel instead of roughly enclosing with a bounding box, was performed for 8 types of surgical instruments.
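To illustrate why a pixel-level mask can be more informative than a bounding box for a thin instrument, the following minimal sketch compares the two on a toy binary mask. The mask, the function names, and the fill-ratio measure are illustrative assumptions and are not taken from the study.

```python
def bounding_box(mask):
    """Return (row_min, col_min, row_max, col_max), inclusive, of nonzero pixels."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


def mask_area(mask):
    """Count the pixels that belong to the instance."""
    return sum(1 for row in mask for v in row if v)


def box_area(box):
    """Pixel area of an inclusive bounding box."""
    r0, c0, r1, c1 = box
    return (r1 - r0 + 1) * (c1 - c0 + 1)


# Hypothetical toy mask: a thin instrument shaft lying diagonally in the frame.
mask = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]

box = bounding_box(mask)
# Fraction of the box that is actually instrument; the rest is background.
fill_ratio = mask_area(mask) / box_area(box)
```

In this toy case the box spans the whole 5 x 5 frame while the instrument occupies only 5 pixels, so most of the box is background that a mask would exclude.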

Main outcomes and measures: Average precision, calculated as the area under the precision-recall curve, was used as the evaluation metric. Average precision is determined by the numbers of true-positive, false-positive, and false-negative results, and the mean average precision across the 8 types of surgical instruments was calculated. Five-fold cross-validation was used as the validation method: the annotation data set was split into 5 segments, of which 4 were used for training and the remaining 1 for validation. The data set was split at the per-case level rather than the per-frame level; thus, images extracted from an intraoperative video in the training set never appeared in the validation set. Validation was performed on all 5 validation sets, and the mean average precision was averaged across them.
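The metric and the validation scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation. The function names are hypothetical. Each detection is assumed to be already marked as a true or false positive (the matching rule and any overlap threshold are not given in the abstract). The area under the curve is taken with the commonly used monotone precision envelope, and cases are assigned to folds round-robin.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one instrument class.

    num_ground_truth = true positives + false negatives, so missed
    instances lower recall and therefore the area.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Assumption: make precision non-increasing before integrating.
    for j in range(len(precisions) - 2, -1, -1):
        precisions[j] = max(precisions[j], precisions[j + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def case_level_folds(frame_case_ids, k=5):
    """Split frame indices into k (train, validation) pairs by case.

    All frames from one case land in the same fold, so no video
    contributes frames to both sides of a split.
    """
    cases = sorted(set(frame_case_ids))
    # Assumption: simple round-robin assignment of cases to folds.
    fold_of_case = {case: idx % k for idx, case in enumerate(cases)}
    folds = []
    for fold in range(k):
        train = [i for i, c in enumerate(frame_case_ids) if fold_of_case[c] != fold]
        val = [i for i, c in enumerate(frame_case_ids) if fold_of_case[c] == fold]
        folds.append((train, val))
    return folds


# Toy data: 4 ranked detections, 4 ground-truth instances (1 is missed).
ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, True], 4)

# Toy data: 10 hypothetical cases with 3 extracted frames each.
frame_case_ids = ["case%02d" % (i // 3) for i in range(30)]
folds = case_level_folds(frame_case_ids, k=5)
```

In the toy example the average precision is 0.625. The mean average precision would then be the mean of such per-class values, averaged again over the 5 validation folds.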

Results: In total, 337 laparoscopic colorectal surgical videos were used. Pixel-by-pixel annotation was manually performed for 81 760 labels on 38 628 static images, constituting the annotation data set. The mean average precisions of the instance segmentation for surgical instruments were 90.9% for 3 instruments, 90.3% for 4 instruments, 91.6% for 6 instruments, and 91.8% for 8 instruments.

Conclusions and relevance: A deep learning-based instance segmentation model that simultaneously recognizes 8 types of surgical instruments with high accuracy was successfully developed. The accuracy was maintained even when the number of types of surgical instruments increased. This model can be applied to surgical innovations, such as intraoperative navigation and surgical automation.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Automation
Colorectal Neoplasms*
Humans
Laparoscopy* / methods
Neural Networks, Computer
Surgical Instruments