A Comparative Study of Machine Learning Methods for Baby Cry Detection Using MFCC Features
Abstract
Crying is one of the primary means by which infants communicate their needs and emotional states to adults. Although a cry can yield crucial insights into a baby's well-being and comfort, little research has investigated how the audio range of a baby cry, that is, the duration of the cry segment analyzed, influences machine learning classification results. The purpose of this study is to determine the impact of cry duration on classification outcomes and to report the accuracy and F1 score obtained with each machine learning method. The contribution is to enrich the understanding of classification and feature selection on audio datasets, particularly for baby cry audio. The dataset, donate-a-cry-corpus, comprises five classes of recordings, each seven seconds long. The methodology combines the spectrogram technique, cross-validation for data partitioning, MFCC feature extraction with 10, 20, and 30 coefficients, and three machine learning models: Support Vector Machine, Random Forest, and Naïve Bayes. The Random Forest model achieved an accuracy of 0.844 and an F1 score of 0.773 with 10 MFCC coefficients at the optimal audio range of six seconds. The Support Vector Machine model with an RBF kernel yielded an accuracy of 0.836 and an F1 score of 0.761, while the Naïve Bayes model achieved an accuracy of 0.538 and an F1 score of 0.539. Notably, no discernible differences were observed for the Support Vector Machine and Naïve Bayes methods across the 1-7 second trials.
This research lays a foundation for advancing early illness identification techniques based on infant vocalizations, thereby enabling faster diagnosis by pediatric practitioners.
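The workflow summarized in the abstract (trim each recording to a chosen duration, extract MFCC features, then compare three classifiers under cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything not stated in the abstract is an assumption made for illustration: the simplified hand-written MFCC routine, the 8 kHz sample rate, averaging MFCCs over frames, five stratified folds, the macro-averaged F1 score, the Gaussian variant of Naïve Bayes, feature scaling before the SVM, and the synthetic tone-plus-noise clips that stand in for the real recordings. The helper names (`trim_to_seconds`, `mfcc_features`, `evaluate`, `synthetic_clips`) are hypothetical.

```python
import numpy as np
from scipy.fft import dct
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def trim_to_seconds(signal, sample_rate, seconds):
    """Keep only the first `seconds` of a 1-D audio signal."""
    return signal[: int(sample_rate * seconds)]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):  # rising edge of the triangle
            bank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):  # falling edge of the triangle
            bank[i - 1, k] = (right - k) / max(right - center, 1)
    return bank


def mfcc_features(signal, sample_rate, n_mfcc, n_fft=512, hop=256, n_filters=40):
    """Simplified MFCC: mean of the first `n_mfcc` cepstra over all frames."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    cepstra = dct(np.log(mel_energy + 1e-10), type=2, axis=1, norm="ortho")
    return cepstra[:, :n_mfcc].mean(axis=0)


def make_models(seed=0):
    """The three model families named in the abstract (settings are assumed)."""
    return {
        "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Naive Bayes": GaussianNB(),
    }


def evaluate(clips, labels, sample_rate, seconds, n_mfcc, n_splits=5):
    """Cross-validated (accuracy, macro F1) per model for one duration setting."""
    X = np.array([
        mfcc_features(trim_to_seconds(c, sample_rate, seconds), sample_rate, n_mfcc)
        for c in clips
    ])
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    results = {}
    for name, model in make_models().items():
        scores = cross_validate(model, X, labels, cv=cv,
                                scoring=("accuracy", "f1_macro"))
        results[name] = (scores["test_accuracy"].mean(),
                         scores["test_f1_macro"].mean())
    return results


def synthetic_clips(n_per_class=10, n_classes=5, sample_rate=8000, seconds=7, seed=0):
    """Synthetic stand-in data: noisy tones, NOT real baby cry recordings."""
    rng = np.random.default_rng(seed)
    t = np.arange(sample_rate * seconds) / sample_rate
    clips, labels = [], []
    for label in range(n_classes):
        for _ in range(n_per_class):
            freq = 300.0 * (label + 1) + rng.normal(0.0, 20.0)
            clips.append(np.sin(2 * np.pi * freq * t) + 0.3 * rng.normal(size=t.size))
            labels.append(label)
    return clips, np.array(labels)


if __name__ == "__main__":
    clips, labels = synthetic_clips()
    for seconds in (1, 6):
        for n_mfcc in (10, 20, 30):
            for name, (acc, f1) in evaluate(clips, labels, 8000, seconds, n_mfcc).items():
                print(f"{seconds}s, {n_mfcc} MFCC, {name}: acc={acc:.3f}, F1={f1:.3f}")
```

Scores printed by this sketch describe only the synthetic tones and bear no relation to the results reported above. To approximate the study's setup, the synthetic clips would be replaced with recordings loaded from the donate-a-cry-corpus and the duration loop extended to cover 1 through 7 seconds.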
Copyright (c) 2024 Putri Agustina Riadi, Mohammad Reza Faisal, Dwi Kartini, Radityo Adi Nugroho, Dodon Turianto Nugrahadi, Dike Bayu Magfira
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.