Improving dental disease diagnosis using a cross attention based hybrid model of DeiT and CoAtNet

Sci Rep. 2026 Jan 6;16(1):805. doi: 10.1038/s41598-025-32514-9.

Abstract

Accurate dental diagnosis is essential for effective treatment planning and better patient outcomes, particularly when identifying dental conditions such as cavities, fillings, implants, and impacted teeth. This study proposes a new hybrid model that integrates the strengths of the data-efficient image transformer (DeiT) and the convolutional attention network (CoAtNet) to enhance diagnostic accuracy. The approach first preprocesses dental radiographic images to improve their quality and support feature extraction. A cross-attention fusion mechanism then aligns and merges the feature representations from DeiT and CoAtNet, leveraging their respective capabilities to capture relevant patterns in the data. A stacking classifier, with base classifiers including support vector machines (SVM), eXtreme gradient boosting (XGBoost), and a multilayer perceptron (MLP), combines the predictions of multiple models to improve classification performance. The proposed model achieves an accuracy of 96%, a precision of 96.5%, a sensitivity of 96.1%, a specificity of 96.4%, and a Dice similarity coefficient of 96.3%, demonstrating its effectiveness for the automatic diagnosis of dental diseases.
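
To illustrate the cross-attention fusion idea in general terms, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each backbone has already produced a short sequence of feature tokens of equal dimension, omits the learned query, key, and value projections a real model would use, and fuses the two streams by letting each attend to the other before mean-pooling and concatenating. The names `cross_attention`, `fuse`, `deit_tokens`, and `coatnet_tokens`, as well as the toy values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention without learned projections.

    queries:     tokens from one backbone (list of d-dimensional vectors)
    keys_values: tokens from the other backbone, used as both keys and values
    Returns one attended d-dimensional vector per query token.
    """
    d = len(queries[0])
    attended = []
    for q in queries:
        # Similarity of this query token to every token of the other stream.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys_values
        ]
        weights = softmax(scores)
        # Weighted average of the other stream's tokens.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(d)]
        )
    return attended


def mean_pool(tokens):
    """Average a list of vectors coordinate by coordinate."""
    return [sum(column) / len(tokens) for column in zip(*tokens)]


def fuse(tokens_a, tokens_b):
    """Bidirectional fusion (an assumed design): each stream attends to the
    other, and the two pooled results are concatenated into one vector."""
    a_attends_b = cross_attention(tokens_a, tokens_b)
    b_attends_a = cross_attention(tokens_b, tokens_a)
    return mean_pool(a_attends_b) + mean_pool(b_attends_a)


# Toy feature tokens standing in for backbone outputs (hypothetical values).
deit_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]
coatnet_tokens = [[0.0, 0.0, 1.0, 0.0], [1.0, 0.0, 0.0, 1.0]]

fused = fuse(deit_tokens, coatnet_tokens)
print(len(fused))
```

In a pipeline like the one described, a fused vector of this kind would be the input to the downstream stacking classifier. The concatenation of two pooled streams is only one possible way to merge them.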

Keywords: Convolutional attention network (CoAtNet); Cross-attention fusion; Data-efficient image transformer (DeiT); Dental X-ray scans; Dental diagnosis.

MeSH terms

  • Algorithms
  • Deep Learning
  • Humans
  • Image Processing, Computer-Assisted* / methods
  • Neural Networks, Computer
  • Stomatognathic Diseases* / diagnosis
  • Stomatognathic Diseases* / diagnostic imaging
  • Support Vector Machine