A Comparative Study of Machine Learning Methods for Baby Cry Detection Using MFCC Features

Putri Agustina Riadi; Mohammad Reza Faisal; Dwi Kartini; Radityo Adi Nugroho; Dodon Turianto Nugrahadi; Dike Bayu Magfira

doi:10.35882/jeeemi.v6i1.350

Putri Agustina Riadi, Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia, https://orcid.org/0009-0003-6585-1842
Mohammad Reza Faisal, Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia, https://orcid.org/0000-0001-5748-7639
Dwi Kartini, Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia, https://orcid.org/0000-0002-7382-5084
Radityo Adi Nugroho, Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia, https://orcid.org/0000-0002-7326-7668
Dodon Turianto Nugrahadi, Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia, https://orcid.org/0000-0001-7746-2658
Dike Bayu Magfira, Information System Department, Universitas Nahdlatul Ulama Surabaya, Surabaya, Indonesia, https://orcid.org/0000-0002-7829-9183

Keywords: Baby cry detection, Spectrogram, MFCC, Machine learning

Abstract

Crying is one of the primary means by which infants communicate their needs and emotional states to adults. Although crying can yield crucial insights into a baby's well-being and comfort, little research has examined how the audio range, that is, the duration of the cry audio that is analyzed, influences machine learning classification of baby cries. This study aims to determine how the duration of an infant's cry affects machine learning classification results and to measure the accuracy and F1 score achieved by each machine learning method. Its contribution is a better understanding of how classification and feature selection apply to audio datasets, particularly baby cry audio. The dataset, donate-a-cry-corpus, contains five classes, with audio recordings seven seconds in duration. The methodology combines the spectrogram technique, cross-validation for data partitioning, MFCC feature extraction with 10, 20, and 30 coefficients, and the Support Vector Machine, Random Forest, and Naïve Bayes models. The Random Forest model achieved an accuracy of 0.844 and an F1 score of 0.773 with 10 MFCC coefficients at the optimal audio range of six seconds. The Support Vector Machine model with an RBF kernel yielded an accuracy of 0.836 and an F1 score of 0.761, while the Naïve Bayes model achieved an accuracy of 0.538 and an F1 score of 0.539. Notably, the Support Vector Machine and Naïve Bayes results showed no discernible differences across the 1-7 second trials. This research lays a foundation for techniques that identify illness early from infant vocalizations, which could help pediatric practitioners reach diagnoses more quickly.
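
The experimental settings named in the abstract can be laid out as a small sketch. It is illustrative only: the abstract does not state the sample rate, which part of each recording is kept, or whether every combination of settings was evaluated, so `SAMPLE_RATE_HZ`, trimming from the start of the clip, the full grid, and the helper names `trim_to_duration` and `experiment_grid` are all assumptions made for this example.

```python
from itertools import product

# Settings stated in the abstract.
DURATIONS_S = range(1, 8)   # the 1-7 second trials
N_MFCC = (10, 20, 30)       # MFCC coefficient counts
MODELS = ("Support Vector Machine", "Random Forest", "Naive Bayes")

# Assumption: the abstract does not give a sample rate.
SAMPLE_RATE_HZ = 8000


def trim_to_duration(samples, duration_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Keep the first `duration_s` seconds of a mono signal.

    Assumption: the clip is trimmed from its start; the abstract does not
    say which part of the recording each trial uses.
    """
    return samples[: duration_s * sample_rate_hz]


def experiment_grid():
    """List every (duration, n_mfcc, model) combination.

    Assumption: a full grid over all three settings.
    """
    return list(product(DURATIONS_S, N_MFCC, MODELS))


# Placeholder for one seven-second recording (silence, not real data).
clip = [0.0] * (7 * SAMPLE_RATE_HZ)

grid = experiment_grid()
print(len(grid))                        # 7 * 3 * 3 = 63 configurations
print(len(trim_to_duration(clip, 6)))   # 6 * 8000 = 48000 samples
```

Under these assumptions, the best configuration reported in the abstract corresponds to the grid entry `(6, 10, "Random Forest")`.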
