A Novel Encoder Decoder Architecture with Vision Transformer for Medical Image Segmentation

Saroj Bala; Kumud Arora; Jeevitha R; Rini Chowdhury; Prashant Kumar; C.Shobana Nageswari

doi:10.35882/jeeemi.v7i1.571

Saroj Bala Department of Master of Computer Applications, Ajay Kumar Garg Engineering College, Ghaziabad, India https://orcid.org/0000-0001-7002-9683
Kumud Arora Department of Computer Science Engineering- Artificial Intelligence & Machine Learning, Inderprastha Engineering College, Ghaziabad, India. https://orcid.org/0000-0001-9883-4676
Jeevitha R Department of Computer Science Engineering, KPR Institute of Engineering and Technology, Coimbatore, India. https://orcid.org/0000-0002-0173-6269
Rini Chowdhury Department of Information Technology Project Circle Bharath Sanchar Nigam Limited, Saltlake Telephone Exchange, Block DE, Lalkuthi, West Bengal, Kolkata, India https://orcid.org/0009-0009-6592-4216
Prashant Kumar Department of Information Technology Project Circle Bharath Sanchar Nigam Limited, Saltlake Telephone Exchange, Block DE, Lalkuthi, West Bengal, Kolkata, India https://orcid.org/0009-0007-3378-615X
C.Shobana Nageswari Department of Electronics and Communication Engineering, R.M.D Engineering College, Kavaraipettai, India. https://orcid.org/0000-0003-1150-2965


Abstract

Brain tumor image segmentation is one of the most critical tasks in medical imaging for diagnosis, treatment planning, and prognosis. Traditional methods for brain tumor image segmentation are mostly based on Convolutional Neural Networks (CNNs), which have proved very powerful but remain limited in capturing long-range dependencies and complex spatial hierarchies in MRI images. Variability in the shape, size, and location of tumors can degrade their performance and lead to suboptimal outcomes. To address these limitations, a new encoder-decoder architecture based on the Vision Transformer (ViT), called VisionTranscoder, is proposed to enhance brain tumor detection and classification. VisionTranscoder exploits the transformer's ability to model global context through self-attention, capturing both local and global features to provide a more comprehensive interpretation of the intricate patterns in medical images and to support classification. Its encoder uses a Vision Transformer to process images as sequences of patches, capturing global dependencies that often lie outside the view of traditional CNNs. The decoder then rebuilds the segmentation map at a high level of fidelity through upsampling and skip connections, which preserve detailed spatial information. The risk of overfitting is substantially reduced by the architecture's design, advanced regularization techniques, and extensive data augmentation. The dataset contains 7,023 human brain MRI images divided into four classes: glioma, meningioma, no tumor, and pituitary. Images in the 'no tumor' class, which indicates an MRI scan without any detectable tumor, were taken from the Br35H dataset. The results demonstrate the effectiveness of VisionTranscoder across a wide set of brain MRI scans, with an accuracy of 98.5% and a loss of 0.05. This performance underlines its ability to accurately segment and classify brain tumors without overfitting.
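The patch-sequence idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names `patchify` and `self_attention`, the toy 4x4 input, the 2x2 patch size, and the use of identity projections in place of learned query, key, and value weights are all assumptions made for illustration. The sketch only shows how an image might be turned into a sequence of patch tokens and how self-attention lets every token draw on every other token, which is the source of the global context the abstract refers to.

```python
import math


def patchify(image, patch_size):
    """Split a 2-D image (list of rows) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch row by row into a single token vector.
            patch = [image[top + r][left + c]
                     for r in range(patch_size)
                     for c in range(patch_size)]
            patches.append(patch)
    return patches


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: Q = K = V = tokens (identity projections), whereas a real
    transformer layer would apply learned weight matrices first.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    output = []
    for query in tokens:
        # Similarity of this token to every token in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in tokens]
        weights = softmax(scores)
        # Weighted average of all tokens: each output sees the whole image.
        mixed = [sum(w * value[i] for w, value in zip(weights, tokens))
                 for i in range(dim)]
        output.append(mixed)
    return output


# Hypothetical toy "scan": 4x4 pixels split into 2x2 patches gives 4 tokens.
image = [[0.0, 0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6, 0.7],
         [0.8, 0.9, 1.0, 1.1],
         [1.2, 1.3, 1.4, 1.5]]
tokens = patchify(image, patch_size=2)
attended = self_attention(tokens)
print(len(tokens), len(tokens[0]))  # number of tokens, values per token
```

In this toy setting each attended token is a weighted average of all four patches, so information from opposite corners of the image is mixed in a single step. A convolution with a small kernel would need several stacked layers to connect the same regions.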

