LSTM and Bi-LSTM Models For Identifying Natural Disasters Reports From Social Media

Rahmi Yunida; Mohammad Reza Faisal; Muliadi; Fatma Indriani; Friska Abadi; Irwan Budiman; Septyan Eka Prastya

doi:10.35882/jeeemi.v5i4.319

Rahmi Yunida Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia https://orcid.org/0009-0007-3265-6046
Mohammad Reza Faisal Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia https://orcid.org/0000-0001-5748-7639
Muliadi Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia https://orcid.org/0000-0003-2871-9482
Fatma Indriani Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia https://orcid.org/0009-0006-7180-6708
Friska Abadi Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia https://orcid.org/0000-0002-9449-8000
Irwan Budiman Computer Science Department, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia https://orcid.org/0000-0002-0514-7429
Septyan Eka Prastya Information Technology Department, Sari Mulia University, Banjarmasin, South Kalimantan, Indonesia https://orcid.org/0000-0001-6836-5514

Keywords: Bi-LSTM, earthquake, LSTM, natural disaster, word embedding

Abstract

Natural disasters cause significant losses, chiefly environmental and property damage and, in the worst cases, loss of life. During some disasters, social media, especially platforms such as Twitter, has served as the fastest channel for informing large numbers of people. Text mining can be used to categorize this information accurately. This study implements two combinations, word2vec with LSTM and word2vec with Bi-LSTM, to determine which is more accurate in a case study of news related to disaster events. Word2vec serves as the feature extraction method, transforming text into vectors for the classification stage, while LSTM and Bi-LSTM serve as the classifiers that categorize the vectorized data. The experiments yield an accuracy of 70.67% for word2vec + LSTM and 72.17% for word2vec + Bi-LSTM, an improvement of 1.5 percentage points for the Bi-LSTM combination. The significance of this research lies in comparing the performance of the two combinations to identify the better one for classifying data related to earthquake disasters. The study also offers insights into the word2vec, LSTM, and Bi-LSTM parameters that researchers can set.
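The structural difference between the two classifiers compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random vectors stand in for word2vec embeddings, the weights are untrained, the sizes and all function names (`make_params`, `lstm_last_state`, `bilstm_last_state`) are hypothetical, and concatenating the two final hidden states is only one common way to build a Bi-LSTM feature vector.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_params(input_size, hidden_size, rng):
    """Random (untrained) weights for the four LSTM gates: i, f, o, g."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {gate: (matrix(hidden_size, input_size + hidden_size), [0.0] * hidden_size)
            for gate in ("i", "f", "o", "g")}


def affine(weights, bias, vec):
    """Compute W @ vec + b with plain lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def lstm_last_state(sequence, params, hidden_size):
    """Run one LSTM over the sequence and return the final hidden state."""
    h = [0.0] * hidden_size  # hidden state
    c = [0.0] * hidden_size  # cell state
    for x in sequence:
        z = list(x) + h  # current input joined with previous hidden state
        i = [sigmoid(a) for a in affine(*params["i"], z)]    # input gate
        f = [sigmoid(a) for a in affine(*params["f"], z)]    # forget gate
        o = [sigmoid(a) for a in affine(*params["o"], z)]    # output gate
        g = [math.tanh(a) for a in affine(*params["g"], z)]  # candidate values
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h


def bilstm_last_state(sequence, fwd_params, bwd_params, hidden_size):
    """Read the sequence in both directions and concatenate the final states."""
    forward = lstm_last_state(sequence, fwd_params, hidden_size)
    backward = lstm_last_state(list(reversed(sequence)), bwd_params, hidden_size)
    return forward + backward


rng = random.Random(0)
EMBED_DIM, HIDDEN = 4, 3  # assumed toy sizes, not the paper's settings

# Stand-in for a five-word message already mapped to word vectors.
tweet_vectors = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(5)]
fwd = make_params(EMBED_DIM, HIDDEN, rng)
bwd = make_params(EMBED_DIM, HIDDEN, rng)

lstm_features = lstm_last_state(tweet_vectors, fwd, HIDDEN)
bilstm_features = bilstm_last_state(tweet_vectors, fwd, bwd, HIDDEN)
print(len(lstm_features), len(bilstm_features))
```

In this sketch the Bi-LSTM feature vector is twice as long as the LSTM one because it also summarizes the message read from its last word to its first. In a full classifier, a dense output layer would follow and the weights would be learned from labeled messages.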
