Microsatellite instability is a key biomarker in gastrointestinal cancers, with significant diagnostic and therapeutic implications. Traditional molecular assays for microsatellite instability detection, while effective, are costly, time-consuming, and require specialized infrastructure. In this paper we propose an explainable deep learning method that detects microsatellite instability directly from histopathological images. We consider four convolutional neural network architectures (MobileNet, Inception, VGG16, and VGG19) and a Vision Transformer model, and we provide clinical explainability for the model predictions through three Class Activation Mapping techniques. To further strengthen the trustworthiness of the predictions, we introduce a set of robustness metrics that quantify how consistently the different Class Activation Mapping methods highlight the same discriminative regions. Experimental results on a real-world dataset show that the VGG16 and VGG19 models achieve the best accuracy, at 0.926 and 0.917, respectively. Furthermore, the Class Activation Mapping techniques confirm that the developed models consistently focus on similar tissue regions, and the robustness analysis shows high agreement between the different Class Activation Mapping techniques. These results indicate that the proposed method not only achieves strong predictive accuracy (up to 0.926) but also provides explainable predictions, which can support the integration of deep learning into real-world clinical practice.
Keywords: Class Activation Mapping; convolutional neural network; deep learning; explainability; microsatellite instability.