Recent trials have evaluated the efficacy of deep convolutional neural network (CNN)-based AI systems to improve lesion detection and characterization in endoscopy. Impressive results are achieved, but many medical studies use a very small image resolution to save computing resources at the cost of losing details. Today, no conventions between resolution and performance exist, and monitoring the performance of various CNN architectures as a function of image resolution provides insights into how subtleties of different lesions on endoscopy affect performance. This can help set standards for image or video characteristics for future CNN-based models in gastrointestinal (GI) endoscopy. This study examines the performance of CNNs on the HyperKvasir dataset, consisting of 10,662 images from 23 different findings. We evaluate two CNN models for endoscopic image classification under quality distortions with image resolutions ranging from 32 × 32 to 512 × 512 pixels. The performance is evaluated using two-fold cross-validation and F1-score, maximum Matthews correlation coefficient (MCC), precision, and sensitivity as metrics. Increased performance was observed with higher image resolution for all findings in the dataset. MCC was achieved at image resolutions between 512 × 512 pixels for classification for the entire dataset after including all subclasses. The highest performance was observed with an MCC value of 0.9002 when the models were trained on the highest resolution and tested on the same resolution. Different resolutions and their effect on CNNs are explored. We show that image resolution has a clear influence on the performance which calls for standards in the field in the future.
Keywords: convolutional neural networks; endoscopic images; image resolution.