A Guide to Annotation of Neurosurgical Intraoperative Video for Machine Learning Analysis and Computer Vision

Dhiraj J Pangal; Guillaume Kugener; Shane Shahrestani; Frank Attenello; Gabriel Zada; Daniel A Donoho

doi:10.1016/j.wneu.2021.03.022

World Neurosurg. 2021 Jun;150:26-30. doi: 10.1016/j.wneu.2021.03.022. Epub 2021 Mar 17.

Authors

Dhiraj J Pangal¹, Guillaume Kugener², Shane Shahrestani³, Frank Attenello², Gabriel Zada², Daniel A Donoho⁴

Affiliations

¹ Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA. Electronic address: pangal@usc.edu.
² Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA.
³ Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA; Department of Medical Engineering, California Institute of Technology, Pasadena, California, USA.
⁴ Department of Neurosurgery, Keck School of Medicine, University of Southern California, Los Angeles, California, USA; Division of Neurosurgery, Department of Surgery, Texas Children's Hospital, Baylor College of Medicine, Houston, Texas, USA.

PMID: 33722717

Abstract

Objective: Computer vision (CV) is a subset of artificial intelligence that performs computations on image or video data, permitting the quantitative analysis of visual information. Common CV tasks that may be relevant to surgeons include image classification, object detection and tracking, and extraction of higher-order features. Despite the potential applications of CV to intraoperative video, few surgeons describe using it. A primary roadblock to implementing CV is the lack of a clear workflow for creating an intraoperative video dataset to which CV can be applied. We report general principles for creating usable surgical video datasets and the results of their application.

Methods: Video annotations from cadaveric endoscopic endonasal skull base simulations (n = 20 trials of 1-5 minutes each, total size = 8 GB) were reviewed by 2 researcher-annotators. An internal, retrospective analysis of the workflow used to develop the intraoperative video annotations was performed to identify guiding practices.

Results: Approximately 34,000 frames of surgical video were annotated. Key considerations in developing annotation workflows include 1) overcoming software and personnel constraints; 2) ensuring adequate storage and access infrastructure; 3) optimizing and standardizing the annotation protocol; and 4) operationalizing annotated data. Potential tools include CVAT (Computer Vision Annotation Tool) and VoTT (Visual Object Tagging Tool), both open-source annotation software packages that allow for local video storage, easy setup, and the use of interpolation.
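To illustrate what interpolation means in this context, the following is a minimal sketch of linear bounding-box interpolation between manually annotated keyframes. It is not the implementation used by CVAT or VoTT, and it is not taken from the study; the `Box` type, the `interpolate_boxes` function, and the example coordinates are all hypothetical and chosen only for illustration. The idea is that an annotator draws a box on a few keyframes and the intermediate frames are filled in automatically, which is one way a small team could plausibly label tens of thousands of frames.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def interpolate_boxes(keyframes: Dict[int, Box]) -> Dict[int, Box]:
    """Linearly interpolate boxes for every frame between annotated keyframes.

    `keyframes` maps a frame index to the box drawn by the annotator.
    Frames before the first or after the last keyframe are not filled in.
    """
    frames = sorted(keyframes)
    result: Dict[int, Box] = {}
    for start, end in zip(frames, frames[1:]):
        a, b = keyframes[start], keyframes[end]
        span = end - start
        for frame in range(start, end):
            t = (frame - start) / span  # 0.0 at `start`, approaches 1.0 at `end`
            result[frame] = Box(
                a.x1 + t * (b.x1 - a.x1),
                a.y1 + t * (b.y1 - a.y1),
                a.x2 + t * (b.x2 - a.x2),
                a.y2 + t * (b.y2 - a.y2),
            )
    if frames:
        # The final keyframe is copied as drawn.
        result[frames[-1]] = keyframes[frames[-1]]
    return result


# Example: an instrument annotated on frames 0 and 4 only (made-up values).
tracked = interpolate_boxes({0: Box(0, 0, 10, 10), 4: Box(8, 4, 18, 14)})
print(tracked[2])
```

Linear interpolation assumes roughly constant motion between keyframes; with fast or erratic instrument movement, an annotator would need to place keyframes more densely or correct intermediate frames by hand.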

Conclusions: CV techniques can be applied to surgical video, but challenges for novice users may limit adoption. We outline principles of annotation workflow that can mitigate the initial challenges groups may face when converting raw video into usable, annotated datasets.

Keywords: Artificial intelligence; Computer vision; Intraoperative video; Machine learning.

Publication types

Video-Audio Media

MeSH terms

Artificial Intelligence*
Data Collection
Humans
Machine Learning*
Surgery, Computer-Assisted / methods*