Purpose: An open question in deep clustering is how to explain what in the image is driving the cluster assignments. This is especially important in medical imaging, where the derived cluster assignments may inform decision-making or define new disease subtypes. We develop cluster activation mapping (CLAM), a methodology for creating localization maps that highlight the image regions important for cluster assignment. Approach: Our approach uses a linear combination of the activation channels from the last layer of the encoder within a pretrained autoencoder. The activation channels are weighted by a channelwise confidence measure, which is a modification of Score-CAM. Results: Our approach performs well in medical imaging-based simulation experiments in which the image clusters differ in the size, location, and intensity of abnormalities. Under simulation, the cluster assignments were predicted with 100% accuracy when the number of clusters was set to the true value. In addition, when applied to computed tomography (CT) scans from a sarcoidosis population, CLAM identified two subtypes of sarcoidosis based purely on CT scan presentation, which were significantly associated with pulmonary function tests and visual assessment scores, such as ground-glass, fibrosis, and honeycombing. Conclusions: CLAM is a transparent methodology for identifying explainable groupings of medical imaging data. Because deep learning networks are often criticized and distrusted for their lack of interpretability, our contribution of CLAM to deep clustering architectures is critical to understanding cluster assignments, which can ultimately lead to new subtypes of diseases.
Keywords: convolutional autoencoder; deep clustering; explainable machine learning; medical imaging; sarcoidosis.
© 2022 Society of Photo-Optical Instrumentation Engineers (SPIE).
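
The weighting scheme summarized in the Approach can be illustrated with a minimal NumPy sketch of a Score-CAM-style map. It is not the authors' implementation: the abstract does not specify the channelwise confidence measure, so `cluster_score` below is a hypothetical stand-in for "confidence that the masked image belongs to the cluster of interest", and the toy image, the two activation channels, and the helper names are all assumptions made for illustration.

```python
import numpy as np

def normalize01(a):
    """Scale an array to [0, 1]; a constant array maps to all zeros."""
    span = a.max() - a.min()
    return (a - a.min()) / span if span > 0 else np.zeros_like(a)

def upsample_nearest(channel, out_shape):
    """Nearest-neighbour upsampling of a 2D map (integer scale factors assumed)."""
    fy = out_shape[0] // channel.shape[0]
    fx = out_shape[1] // channel.shape[1]
    return np.kron(channel, np.ones((fy, fx)))

def cluster_activation_map(image, activations, cluster_score):
    """Score-CAM-style localization map for one cluster (illustrative only).

    image:         (H, W) input image
    activations:   (K, h, w) activation channels from the encoder's last layer
    cluster_score: hypothetical callable, (H, W) image -> scalar confidence
    """
    upsampled = np.stack([upsample_nearest(a, image.shape) for a in activations])
    # Each normalized channel acts as a soft mask on the input image.
    masks = np.stack([normalize01(u) for u in upsampled])
    scores = np.array([cluster_score(image * m) for m in masks])
    # Softmax turns the per-channel confidences into weights that sum to 1.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Linear combination of the activation channels, then ReLU and rescaling.
    cam = np.maximum(np.tensordot(weights, upsampled, axes=1), 0.0)
    return normalize01(cam), weights

# Toy data (assumed): a bright abnormality in the top-left quadrant.
image = np.zeros((8, 8))
image[:4, :4] = 1.0
acts = np.zeros((2, 4, 4))
acts[0, :2, :2] = 1.0   # channel 0 responds to the top-left region
acts[1, 2:, 2:] = 1.0   # channel 1 responds to the bottom-right region

# Stand-in confidence: mean intensity retained in the top-left quadrant.
score = lambda x: float(x[:4, :4].mean())

cam, weights = cluster_activation_map(image, acts, score)
```

In this toy setup the channel that overlaps the abnormality receives the larger weight, so the resulting map peaks over the top-left quadrant. In practice the scoring function would come from the trained deep clustering model rather than from a hand-written rule.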