HST-Net: Hierarchical Spectrum-Tokenization with Progressive Refinement for Cardiac MRI Segmentation
Abstract
Accurate segmentation of cardiac structures from magnetic resonance imaging (MRI) is vital for quantitative ventricular assessment, functional analysis, and the clinical diagnosis of cardiovascular disease. Precise delineation of the left ventricle, right ventricle, and myocardial wall is essential for evaluating cardiac morphology and function. Transformer-based architectures such as TransUNet and Swin-UNet model long-range dependencies and capture global context well, but they often struggle to preserve smooth anatomical geometry and to delineate boundaries precisely, particularly under the large shape deformations and inter-subject variability common in cardiac MRI data. To overcome these limitations, a Hierarchical Spectrum-Tokenization Network (HST-Net) is proposed. Its core idea is to represent cardiac anatomy at multiple levels of granularity, enabling more robust structural understanding across spatial scales. The architecture introduces Spectrum Tokenization, which divides the latent representation into two parts: low-frequency global tokens that capture contextual information, and high-frequency boundary-aware tokens that capture contours. A progressive refinement stage (PSR) then enhances boundary details step by step, improving contour accuracy, especially for complex and thin structures. On a cardiac MRI dataset, HST-Net achieves an average Dice coefficient of 91.6% and a pixel-wise segmentation accuracy of 94.8%. Compared with nnU-Net and Swin-UNet, it yields consistent gains of 2.1–3.4% in Dice score and 1.9–2.6% in segmentation accuracy across different cardiac structures.
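The abstract does not say how the low- and high-frequency parts are obtained. As an illustration only, the sketch below separates a 2D map into a low-frequency (global context) part and a high-frequency (boundary) part using a radial mask in the Fourier domain. The function name `split_spectrum`, the `cutoff` parameter, and the synthetic disk image are assumptions made for this example; they are not taken from HST-Net, which may compute its tokens quite differently.

```python
import numpy as np


def split_spectrum(feature_map, cutoff=0.25):
    """Split a 2D array into low- and high-frequency parts.

    `cutoff` (hypothetical parameter) is the fraction of the Nyquist
    frequency that is kept in the low-frequency band.
    """
    h, w = feature_map.shape
    # Frequency coordinates in cycles per pixel, range [-0.5, 0.5).
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)

    # Boolean mask selecting frequencies inside the low band.
    low_mask = radius <= cutoff * 0.5

    spectrum = np.fft.fft2(feature_map)
    # The mask is symmetric, so both inverse transforms are real
    # up to floating-point error; keep the real part only.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


# Synthetic example: a filled disk standing in for a cardiac structure.
yy, xx = np.mgrid[0:64, 0:64]
image = ((yy - 32) ** 2 + (xx - 32) ** 2 <= 15 ** 2).astype(float)

low, high = split_spectrum(image, cutoff=0.25)

# The two bands partition the spectrum, so they sum back to the input.
print(np.allclose(low + high, image))
```

In this sketch the high-frequency part is concentrated around the edge of the disk, which gives an intuition for why a high-frequency branch could be described as boundary-aware, while the low-frequency part retains the overall shape and position.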
Copyright (c) 2026 Naga Chandrika Gogulamudi, Shamia D, V Kavithamani, Amitha Ida Chandran, K Venu, Kunchanapalli Rama Krishna

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

