End to end vision transformer architecture for brain stroke assessment based on multi-slice classification and localization using computed tomography

Muhammad Ayoub; Zhifang Liao; Shabir Hussain; Lifeng Li; Chris W J Zhang; Kelvin K L Wong

doi:10.1016/j.compmedimag.2023.102294

Comput Med Imaging Graph. 2023 Oct:109:102294. doi: 10.1016/j.compmedimag.2023.102294. Epub 2023 Sep 6.

Authors

Muhammad Ayoub¹, Zhifang Liao¹, Shabir Hussain², Lifeng Li³, Chris W J Zhang⁴, Kelvin K L Wong⁵

Affiliations

¹ School of Computer Science and Engineering, Central South University, Changsha 410017, Hunan, China.
² Department of Computer Science, National College of Business Administration and Economics, Lahore, Punjab, 05499, Pakistan.
³ Department of Radiology, The Affiliated Changsha Central Hospital, Hengyang Medical School, University of South China, Changsha 410017, China.
⁴ Department of Mechanical Engineering, College of Engineering, University of Saskatchewan, S7N 5A9 Saskatoon, SK, Canada.
⁵ Department of Mechanical Engineering, College of Engineering, University of Saskatchewan, S7N 5A9 Saskatoon, SK, Canada. Electronic address: kelvin.wong@usask.ca.

PMID: 37713999
DOI: 10.1016/j.compmedimag.2023.102294

Abstract

Background: Brain stroke is a leading cause of disability and death worldwide, and early diagnosis and treatment are critical to improving patient outcomes. Current stroke diagnosis methods are subjective and prone to errors, as radiologists rely on manual selection of the most important CT slice. This highlights the need for more accurate and reliable automated brain stroke diagnosis and localization methods to improve patient outcomes.

Purpose: In this study, we aimed to enhance the vision transformer architecture for two tasks: multi-slice classification of each patient's CT scans into three categories (Normal, Infarction, and Hemorrhage), and patient-wise stroke localization, both within an end-to-end framework. This framework can provide an automated, objective, and consistent approach to stroke diagnosis and localization, enabling personalized treatment plans based on the location and extent of the stroke.

Methods: We modified the Vision Transformer (ViT) in combination with neural network layers for the multi-slice classification of each patient's brain CT scans into normal, infarction, and hemorrhage classes. For stroke localization, we combined the ViT architecture with convolutional neural network layers to detect stroke and localize infarction and hemorrhage regions with bounding boxes, in a patient-wise manner based on multiple slices.
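As a rough illustration of the generic ViT building blocks that the Methods refer to, the sketch below splits one 2D slice into non-overlapping patches, projects them to token embeddings, and applies a single head of scaled dot-product self-attention. It is not the authors' modified architecture: the slice size, patch size, embedding width, random weights, and the helper names `split_into_patches` and `self_attention` are all assumptions chosen for illustration, and position embeddings, the class token, and the classification and localization heads are omitted.

```python
import numpy as np


def split_into_patches(slice_2d, patch_size):
    """Split an (H, W) slice into non-overlapping, flattened square patches."""
    h, w = slice_2d.shape
    if h % patch_size or w % patch_size:
        raise ValueError("slice dimensions must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, p, cols, p) -> (rows, cols, p, p) -> (rows * cols, p * p)
    patches = slice_2d.reshape(rows, patch_size, cols, patch_size)
    patches = patches.transpose(0, 2, 1, 3)
    return patches.reshape(rows * cols, patch_size * patch_size)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)

# Hypothetical stand-in for one preprocessed CT slice (random values, not real data).
ct_slice = rng.random((64, 64))

patches = split_into_patches(ct_slice, patch_size=16)  # 16 patches of 256 values

embed_dim = 32  # assumed embedding width, chosen only for this sketch
w_embed = rng.normal(scale=0.02, size=(patches.shape[1], embed_dim))
tokens = patches @ w_embed  # linear patch embedding

w_q, w_k, w_v = (rng.normal(scale=0.1, size=(embed_dim, embed_dim)) for _ in range(3))
attended, attn_weights = self_attention(tokens, w_q, w_k, w_v)
```

Each row of `attn_weights` is a probability distribution over the patches of the slice, which is how self-attention lets every patch draw on context from every other patch.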

Results: Our proposed framework achieved an overall accuracy of 87.51% in classifying brain CT scan slices and showed high precision in localizing the stroke patient-wise. Our results demonstrate the potential of our method for accurate and reliable stroke diagnosis and localization.
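For readers unfamiliar with the metric, overall accuracy in a three-class setting is the number of correctly classified slices divided by the total number of slices. The sketch below computes it from a confusion matrix. The counts are invented purely for illustration and are not the study's data, and `overall_accuracy` and `per_class_recall` are hypothetical helper names.

```python
CLASSES = ("Normal", "Infarction", "Hemorrhage")


def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_class_recall(confusion):
    """Recall per class: correct predictions divided by true samples of that class."""
    return {
        name: (row[i] / sum(row) if sum(row) else 0.0)
        for i, (name, row) in enumerate(zip(CLASSES, confusion))
    }


# Rows are true classes, columns are predicted classes.
# Toy counts for illustration only; NOT taken from the paper.
toy_confusion = [
    [50, 3, 2],   # true Normal
    [4, 40, 6],   # true Infarction
    [1, 5, 39],   # true Hemorrhage
]

accuracy = overall_accuracy(toy_confusion)
recall = per_class_recall(toy_confusion)
print(f"overall accuracy: {accuracy:.2%}")
```

Reporting per-class recall alongside overall accuracy helps show whether one class, for example infarction, is being missed more often than the others.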

Conclusion: Our study enhanced the ViT architecture for automated stroke diagnosis and localization using brain CT scans, which could have significant implications for stroke management and treatment. Deep learning algorithms can provide a more objective and consistent approach to stroke diagnosis and potentially enable personalized treatment plans based on the location and extent of the stroke. Further studies are needed to validate our method on larger and more diverse datasets and to explore its clinical utility in real-world settings.

Keywords: Brain stroke; Classification and detection; Deep learning; Self-attention mechanism; Vision transformers architecture.

MeSH terms

Brain* / diagnostic imaging
Hemorrhage
Humans
Infarction
Stroke* / diagnostic imaging
Tomography, X-Ray Computed