MK–TripNet: A Deep Learning Framework for Real-Time Multi-Class Lung Sound Classification

Keywords: Multi Kernel; Triplet Loss; Sliding Window; CNN; MFCC

Abstract

Respiratory diseases such as asthma, pneumonia, and Chronic Obstructive Pulmonary Disease (COPD) remain major global health challenges, particularly in resource-limited settings where access to pulmonary specialists and early diagnostic tools is limited. Automatic lung sound classification has emerged as a promising non-invasive screening approach; however, existing methods often rely on single-scale feature extraction, conventional loss functions, and offline analysis, which limit their discriminative capability and real-time applicability. The aim of this study is to develop and evaluate a deep learning framework for real-time multi-class lung sound classification that improves discriminative representation and temporal sensitivity. To address these limitations, this study proposes MK-TripNet, a novel deep learning architecture designed to integrate multi-scale feature extraction, discriminative embedding learning, and real-time inference within a unified framework. The main contribution of this work is the unified integration of a Multi-Kernel convolutional architecture, Triplet Loss-based embedding learning, and Sliding Window segmentation within a single end-to-end framework, enabling accurate segment-level lung sound classification in real-time scenarios. Unlike prior approaches, the proposed method simultaneously captures fine-grained temporal patterns and broader spectral characteristics while explicitly maximizing inter-class separability in the embedding space. The proposed model was evaluated using a newly constructed dataset comprising 1,409 lung sound segments obtained from primary digital stethoscope recordings and publicly available respiratory sound databases. Experimental results demonstrate that MK-TripNet consistently outperforms several strong baseline models, including CNN-BiGRU, CNN-BiGRU-UMAP, and VGGish-Triplet, achieving an accuracy of 89.1%, an F1-score of 0.89, and a recall of 0.88.
Ablation studies further confirm that the combined use of Multi-Kernel convolution, Triplet Loss, and Sliding Window segmentation yields the most robust and generalizable performance. These findings highlight the clinical potential of MK-TripNet for real-time digital auscultation and point-of-care respiratory screening, particularly in resource-limited and telemedicine settings.
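The three components named in the abstract can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the window length, hop size, kernel widths, and triplet margin below are illustrative assumptions, and the averaging kernels merely stand in for learned multi-kernel convolution filters. The triplet loss itself is the standard formulation, max(d(a,p) − d(a,n) + margin, 0).

```python
import numpy as np

def sliding_windows(signal, win_len, hop):
    """Split a 1-D signal into fixed-length overlapping segments
    (the Sliding Window segmentation step)."""
    starts = range(0, len(signal) - win_len + 1, hop)
    return np.stack([signal[s:s + win_len] for s in starts])

def multi_kernel_features(segment, kernel_sizes=(3, 5, 7)):
    """Convolve one segment with kernels of several widths and
    concatenate the outputs, mimicking multi-scale feature extraction.
    Averaging kernels are placeholders for learned CNN filters."""
    outs = [np.convolve(segment, np.ones(k) / k, mode='valid')
            for k in kernel_sizes]
    return np.concatenate(outs)

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss over embedding vectors: pull same-class
    embeddings together, push different-class embeddings at least
    `margin` apart (squared Euclidean distance)."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()
```

For example, a 10-sample signal with `win_len=4` and `hop=2` yields four overlapping segments, and a triplet whose negative is already far from the anchor incurs zero loss.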



Published
2026-03-30
How to Cite
[1] W. S. Erini, G. P. Thomas, G. S. Badia, A. Rahadian, S. B. Raharjo, and S. A. Wulandari, “MK–TripNet: A Deep Learning Framework for Real-Time Multi-Class Lung Sound Classification”, j.electron.electromedical.eng.med.inform, vol. 8, no. 2, pp. 504–516, Mar. 2026.
Section
Medical Engineering