(1) Background: Contact Endoscopy (CE) and Narrow Band Imaging (NBI) are optical imaging modalities that can provide enhanced and magnified visualization of the superficial vascular networks in the laryngeal mucosa. The similarity of vascular structures between benign and malignant lesions causes a challenge in the visual assessment of CE-NBI images. The main objective of this study is to use Deep Convolutional Neural Networks (DCNN) for the automatic classification of CE-NBI images into benign and malignant groups with minimal human intervention. (2) Methods: A pretrained Res-Net50 model combined with the cut-off-layer technique was selected as the DCNN architecture. A dataset of 8181 CE-NBI images was used during the fine-tuning process in three experiments where several models were generated and validated. The accuracy, sensitivity, and specificity were calculated as the performance metrics in each validation and testing scenario. (3) Results: Out of a total of 72 trained and tested models in all experiments, Model 5 showed high performance. This model is considerably smaller than the full ResNet50 architecture and achieved the testing accuracy of 0.835 on the unseen data during the last experiment. (4) Conclusion: The proposed fine-tuned ResNet50 model showed a high performance to classify CE-NBI images into the benign and malignant groups and has the potential to be part of an assisted system for automatic laryngeal cancer detection.
Keywords: Deep Convolution Neural Network; cancer; classification; contact endoscopy; larynx; narrow band imaging.