M3YOLOv5: Feature enhanced YOLOv5 model for mandibular fracture detection

Comput Biol Med. 2024 May:173:108291. doi: 10.1016/j.compbiomed.2024.108291. Epub 2024 Mar 20.

Abstract

Background: Detecting the mandibular fracture region is clinically important. However, the size of the fracture region varies with anatomical position, fracture site and the degree of force applied, which makes it difficult to locate and recognize the region accurately.

Methods: To address these problems, this paper proposes the M3YOLOv5 model, which uses three feature enhancement strategies to improve the model's ability to locate and recognize the mandibular fracture region. First, a Global-Local Feature Extraction Module (GLFEM) is designed. By combining a Convolutional Neural Network (CNN) with a Transformer, it compensates for the CNN's limited ability to extract global information and improves the model's localization of the fracture region. Second, to improve the interaction of context information, a Deep-Shallow Feature Interaction Module (DSFIM) is designed. In this module, spatial information from the shallow feature layer is embedded into the deep feature layer through a spatial attention mechanism, and semantic information from the deep feature layer is embedded into the shallow feature layer through a channel attention mechanism, which improves the model's ability to recognize the fracture region. Finally, a Multi-scale Multi receptive-field Feature Mixing Module (MMFMM) is designed. This module uses deep separate convolution chains, each composed of multiple layers with different scales and different dilation coefficients. This gives the model a richer set of receptive fields and improves its ability to detect fracture regions of different scales.
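
The effect of mixing dilation coefficients on the receptive field can be illustrated with a short calculation. The sketch below is not the authors' implementation: it assumes stride-1 convolutions, and the kernel sizes and dilation coefficients are hypothetical, since the abstract does not state the values used in MMFMM. Under that assumption, each layer with kernel size k and dilation d widens the receptive field by (k - 1) * d.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, along one axis) of a chain of
    stride-1 convolutions.

    `layers` is a sequence of (kernel_size, dilation) pairs.
    """
    rf = 1  # a single pixel sees only itself
    for kernel_size, dilation in layers:
        # Each stride-1 layer adds (k - 1) * d pixels of context.
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical chains; the values are illustrative only, not from the paper.
chains = {
    "3x3, dilations 1-1-1": [(3, 1), (3, 1), (3, 1)],
    "3x3, dilations 1-2-3": [(3, 1), (3, 2), (3, 3)],
    "5x5, dilations 1-2-3": [(5, 1), (5, 2), (5, 3)],
}

for name, layers in chains.items():
    print(f"{name}: receptive field = {receptive_field(layers)}")
```

With these illustrative settings the three chains cover 7, 13 and 25 input pixels per axis, which shows how parallel chains with different scales and dilations can respond to targets of different sizes.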

Results: On the mandibular fracture CT data set, the M3YOLOv5 model achieves a precision of 97.18%, an mAP of 96.86%, a recall of 94.42% and an F1 score of 95.58%. The experimental results show that M3YOLOv5 outperforms the mainstream detection models.
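
For reference, precision, recall and F1 are conventionally derived from true-positive, false-positive and false-negative counts, as sketched below. The counts are hypothetical and chosen only for illustration. They are not taken from the paper, and the abstract does not describe the exact evaluation protocol (for example, the IoU threshold used to match detections to ground truth).

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from detection counts.

    Assumes tp > 0 so that no denominator is zero.
    """
    precision = tp / (tp + fp)  # fraction of predicted boxes that are correct
    recall = tp / (tp + fn)     # fraction of ground-truth boxes that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = detection_metrics(tp=90, fp=10, fn=30)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
# prints: precision=0.9000 recall=0.7500 f1=0.8182
```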

Conclusion: The M3YOLOv5 model can effectively recognize and locate the mandibular fracture region, which is of considerable value for clinical diagnosis.

Keywords: CNN; Deep separate convolution; Mandibular fracture; Transformer; YOLOv5.

MeSH terms

  • Humans
  • Information Storage and Retrieval
  • Mandibular Fractures* / diagnostic imaging
  • Neural Networks, Computer
  • Semantics