Implementation of Information Gain Ratio and Particle Swarm Optimization in the Sentiment Analysis Classification of Covid-19 Vaccine Using Support Vector Machine

Keywords: Covid-19 vaccine, Information Gain Ratio, Particle Swarm Optimization, Support Vector Machine

Abstract

In the current digital era, sentiment analysis has become an effective method for identifying and interpreting public opinion on various topics, including public health issues such as COVID-19 vaccination. Vaccination is a crucial measure in tackling the pandemic, yet a number of people remain skeptical of the COVID-19 vaccine and reluctant to receive it. This public perception is largely influenced by various factors, including information received from social media and online platforms. Sentiment analysis is therefore one way to understand how the public perceives the COVID-19 vaccine. This research aims to improve classification performance in sentiment analysis of COVID-19 vaccines by applying Information Gain Ratio (IGR) feature selection and Particle Swarm Optimization (PSO) to the Support Vector Machine (SVM). The dataset contains 2000 entries, 1000 labeled positive and 1000 labeled negative, and validation combined an 80:20 data split with stratified 10-fold cross-validation. The basic SVM obtained an accuracy of 0.794 and an AUC of 0.890. Adding IGR feature selection raised the accuracy to 0.814 and the AUC to 0.907. Combining the PSO-based SVM with IGR raised the accuracy further to 0.837, with an AUC of 0.913. These results demonstrate that combining feature selection with parameter optimization can improve the performance of sentiment classification for COVID-19 vaccines. The research concludes that integrating IGR and PSO contributes positively to the effectiveness and predictive capability of the SVM model in sentiment classification tasks.
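As an illustration of the feature-scoring step named in the abstract, the following is a minimal sketch of the gain ratio in its standard form (information gain divided by the entropy of the feature itself). It is not the authors' implementation: the function names, the binary term-presence encoding, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the feature's entropy."""
    total = len(labels)
    split_info = entropy(feature)
    if split_info == 0:
        # A constant feature cannot separate the classes; score it as zero.
        return 0.0
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    return (entropy(labels) - conditional) / split_info


# Hypothetical toy data: 1 = term present in the text, 0 = term absent.
labels = ["pos", "pos", "neg", "neg"]
term_a = [1, 1, 0, 0]  # presence matches the class exactly
term_b = [1, 0, 1, 0]  # presence is unrelated to the class

print(gain_ratio(term_a, labels))
print(gain_ratio(term_b, labels))
```

In a pipeline like the one described, terms would be ranked by this score and only the highest-scoring ones passed to the classifier; the cutoff used in the study is not stated in the abstract.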

Published
2023-09-24
How to Cite
[1]
Muhamad Fawwaz Akbar, Muhammad Itqan Mazdadi, Muliadi, Triando Hamonangan Saragih, and Friska Abadi, “Implementation of Information Gain Ratio and Particle Swarm Optimization in the Sentiment Analysis Classification of Covid-19 Vaccine Using Support Vector Machine”, j.electron.electromedical.eng.med.inform, vol. 5, no. 4, pp. 261-270, Sep. 2023.
Section
Electronics