Objectives: To retrospectively assess radiographic data and to prospectively classify radiographs (namely, panoramic, bitewing, periapical, and cephalometric images), we compared three deep learning architectures for their classification performance.
Methods: Our dataset consisted of 31,288 panoramic, 43,598 periapical, 14,326 bitewing, and 1176 cephalometric radiographs from two centers (Berlin/Germany; Lucknow/India). For a subset of images L (32,381 images), image classifications were available and manually validated by an expert. The remaining subset of images U was iteratively annotated using active learning, with ResNet-34 being trained on L, least confidence informative sampling being performed on U, and the most uncertain image classifications from U being reviewed by a human expert and iteratively used for re-training. We then employed a baseline convolutional neural networks (CNN), a residual network (another ResNet-34, pretrained on ImageNet), and a capsule network (CapsNet) for classification. Early stopping was used to prevent overfitting. Evaluation of the model performances followed stratified k-fold cross-validation. Gradient-weighted Class Activation Mapping (Grad-CAM) was used to provide visualizations of the weighted activations maps.
Results: All three models showed high accuracy (>98%) with significantly higher accuracy, F1-score, precision, and sensitivity of ResNet than baseline CNN and CapsNet (p < 0.05). Specificity was not significantly different. ResNet achieved the best performance at small variance and fastest convergence. Misclassification was most common between bitewings and periapicals. For bitewings, model activation was most notable in the inter-arch space for periapicals interdentally, for panoramics on bony structures of maxilla and mandible, and for cephalometrics on the viscerocranium.
Conclusions: Regardless of the models, high classification accuracies were achieved. Image features considered for classification were consistent with expert reasoning.
Keywords: artificial intelligence; classification deep learning; dental; machine learning; radiographs; teeth.