Hierarchical Swin Transformer for Multi-Stage Dementia Diagnosis with Clinically-Grounded Visual Explainability
DOI: https://doi.org/10.66279/j4m1km41

Keywords: Dementia Detection, Alzheimer’s Disease, Brain MRI, Swin Transformer, Explainable AI

Abstract
This paper presents a novel multi-stage dementia diagnosis framework integrating a Swin Transformer architecture with explainable AI for brain MRI analysis. The proposed approach addresses two critical challenges: capturing both local and global structural features through hierarchical Vision Transformer processing, and providing clinically interpretable decisions via Grad-CAM visualization.
Our model was evaluated on a Kaggle dataset comprising 6,400 MRI images across four dementia stages: non-demented (3,200), very mild (2,240), mild (896), and moderate (64). The dataset was split into 70% training, 15% validation, and 15% testing. Experimental results demonstrate superior performance with 97.3% accuracy, per-class precision ranging from 94.8% to 100%, recall from 91.1% to 100%, and a macro F1-score of 96.5%. Statistical validation through 5-fold cross-validation (96.8% ± 0.4%) confirms robustness.
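As a sanity check on the reported setup (this is not the authors' code), the arithmetic behind the stratified 70/15/15 split and the macro F1-score can be sketched in a few lines. The class counts come from the abstract; the helper names and any metric values passed in are illustrative placeholders.

```python
# Class counts as stated in the abstract (6,400 images total).
counts = {"non-demented": 3200, "very mild": 2240, "mild": 896, "moderate": 64}

def split_sizes(n, fractions=(0.70, 0.15, 0.15)):
    """Hypothetical stratified split: each class keeps the 70/15/15 ratio."""
    train = round(n * fractions[0])
    val = round(n * fractions[1])
    return train, val, n - train - val  # remainder goes to the test set

def macro_f1(per_class_precision, per_class_recall):
    """Macro F1 averages per-class F1, so the 64-image 'moderate' class
    counts as much as the 3,200-image majority class."""
    f1s = [2 * p * r / (p + r)
           for p, r in zip(per_class_precision, per_class_recall)]
    return sum(f1s) / len(f1s)
```

For example, `split_sizes(3200)` yields `(2240, 480, 480)` for the non-demented class. The macro averaging explains why strong recall on the rare moderate class matters for the reported 96.5% macro F1 despite the heavy class imbalance.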
The SwinGrad-CAM component successfully identifies clinically relevant biomarkers, including hippocampal atrophy and ventricular enlargement, aligning with established neurological indicators. For very mild cases, heatmaps highlight early temporal lobe changes, while moderate cases show intense activation in regions with severe cortical atrophy. This interpretable framework offers a robust basis for early intervention, precise staging, and personalized treatment planning in dementia care. By enabling clinicians to visually validate the model's reasoning, it helps bridge the gap between deep learning performance and clinical trust.
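At its core, the Grad-CAM step that produces these heatmaps reduces, for a chosen feature layer, to global-average-pooling each channel's gradient into a weight and taking a ReLU of the weighted sum of activation maps. The following framework-free sketch illustrates only that formula on hypothetical 2×2 feature maps; the paper's SwinGrad-CAM must additionally reshape the Swin Transformer's window tokens back into a spatial grid before this step, which is not shown here.

```python
# Grad-CAM formula (toy illustration, not the paper's implementation):
#   alpha_k = spatial mean of d(score)/d(A_k)
#   CAM     = ReLU( sum_k alpha_k * A_k )
# Activations and gradients below are made-up 2x2 maps for 2 channels.

activations = [
    [[1.0, 0.0], [0.5, 2.0]],      # channel 0 feature map A_0
    [[0.2, 0.4], [0.0, 1.0]],      # channel 1 feature map A_1
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # d(score)/dA_0
    [[-1.0, -1.0], [-1.0, -1.0]],  # d(score)/dA_1 (suppressive channel)
]

def grad_cam(acts, grads):
    # Global-average-pool each channel's gradient -> channel weight alpha_k.
    alphas = [sum(sum(row) for row in g) / (len(g) * len(g[0]))
              for g in grads]
    h, w = len(acts[0]), len(acts[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for a_k, alpha in zip(acts, alphas):
        for i in range(h):
            for j in range(w):
                cam[i][j] += alpha * a_k[i][j]
    # ReLU keeps only regions that increase the predicted class score.
    return [[max(0.0, v) for v in row] for row in cam]

heatmap = grad_cam(activations, gradients)  # [[0.3, 0.0], [0.25, 0.0]]
```

The ReLU is what lets the heatmap highlight only regions (e.g. atrophied hippocampus or enlarged ventricles) that positively support the predicted dementia stage, while suppressive evidence is zeroed out.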
License
Copyright (c) 2026 Journal of Smart Algorithms and Applications (JSAA)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Journal of Smart Algorithms and Applications (JSAA) content is published under a Creative Commons Attribution License (CC BY). This means that content is freely available to all readers upon publication, and articles are published as soon as production is complete.









