MK-TripNet: A Deep Learning Framework for Real-Time Multi-Class Lung Sound Classification
Abstract
Respiratory diseases such as asthma, pneumonia, and Chronic Obstructive Pulmonary Disease (COPD) remain major global health challenges, particularly in resource-limited settings where access to pulmonary specialists and early diagnostic tools is limited. Automatic lung sound classification has emerged as a promising non-invasive screening approach; however, existing methods often rely on single-scale feature extraction, conventional loss functions, and offline analysis, which limit their discriminative capability and real-time applicability. The aim of this study is to develop and evaluate a deep learning framework for real-time multi-class lung sound classification that improves discriminative representation and temporal sensitivity. To address these limitations, this study proposes MK-TripNet, a novel deep learning architecture designed to integrate multi-scale feature extraction, discriminative embedding learning, and real-time inference within a unified framework. The main contribution of this work is the unified integration of a Multi-Kernel convolutional architecture, Triplet Loss-based embedding learning, and Sliding Window segmentation within a single end-to-end framework, enabling accurate segment-level lung sound classification in real-time scenarios. Unlike prior approaches, the proposed method simultaneously captures fine-grained temporal patterns and broader spectral characteristics while explicitly maximizing inter-class separability in the embedding space. The proposed model was evaluated using a newly constructed dataset comprising 1,409 lung sound segments obtained from primary digital stethoscope recordings and publicly available respiratory sound databases. Experimental results demonstrate that MK-TripNet consistently outperforms several strong baseline models, including CNN-BiGRU, CNN-BiGRU-UMAP, and VGGish-Triplet, achieving an accuracy of 89.1%, an F1-score of 0.89, and a recall of 0.88.
Ablation studies further confirm that the combined use of Multi-Kernel convolution, Triplet Loss, and Sliding Window segmentation yields the most robust and generalizable performance. These findings highlight the clinical potential of MK-TripNet for real-time digital auscultation and point-of-care respiratory screening, particularly in resource-limited and telemedicine settings.
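The three mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the trained MK-TripNet model: the filter weights, kernel sizes, window length, hop size, and margin below are illustrative assumptions, chosen only to show how sliding-window segmentation, multi-kernel feature extraction, and a triplet margin loss fit together.

```python
import numpy as np

rng = np.random.default_rng(0)

def sliding_windows(signal, win, hop):
    """Segment a 1-D recording into fixed-length, overlapping windows
    so that each segment can be classified in (near) real time."""
    starts = range(0, len(signal) - win + 1, hop)
    return np.stack([signal[s:s + win] for s in starts])

def multi_kernel_features(window, kernel_sizes=(3, 7, 15)):
    """Convolve one window with filters of several widths and pool each
    response. Narrow kernels respond to fine temporal detail (e.g.
    crackles); wide kernels to broader patterns (e.g. wheezes). The
    random filters here stand in for learned convolutional weights."""
    feats = []
    for k in kernel_sizes:
        kern = rng.standard_normal(k) / np.sqrt(k)
        resp = np.convolve(window, kern, mode="same")
        feats.append(np.maximum(resp, 0.0).mean())  # ReLU + global pooling
    return np.array(feats)

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge on embedding distances: push same-class segments closer
    together than different-class segments by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy end-to-end pass on a synthetic 4000-sample "recording".
sig = rng.standard_normal(4000)
segs = sliding_windows(sig, win=1000, hop=500)        # shape (7, 1000)
embeds = [multi_kernel_features(w) for w in segs]     # one vector per segment
loss = triplet_loss(embeds[0], embeds[1], embeds[2])  # anchor/pos/neg stand-ins
```

In the full framework the embedding network would be trained by minimizing this loss over mined triplets, after which each incoming sliding window is embedded and assigned to the nearest class in embedding space.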
Copyright (c) 2026 Widya Surya Erini, Gracia Putri Thomas, Giulia Salzano Badia, Arief Rahadian, Sofyan Budi Raharjo, Sari Ayu Wulandari

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.