Sentiment Analysis of TikTok Shop Closure in Indonesia on Twitter Using Supervised Machine Learning

Noor Zalekha Al Habesyah; Rudy Herteno; Fatma Indriani; Irwan Budiman; Dwi Kartini

doi:10.35882/jeeemi.v6i2.381

Noor Zalekha Al Habesyah, Department of Computer Science, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia
Rudy Herteno, Department of Computer Science, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia, https://orcid.org/0000-0003-0637-8090
Fatma Indriani, Department of Computer Science, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia, https://orcid.org/0009-0006-7180-6708
Irwan Budiman, Department of Computer Science, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia, https://orcid.org/0000-0002-0514-7429
Dwi Kartini, Department of Computer Science, Lambung Mangkurat University, Banjarbaru, South Kalimantan, Indonesia, https://orcid.org/0000-0002-7382-5084

Abstract

TikTok Shop is a feature of the TikTok application that lets users buy and sell products. Its integration with social media has opened new opportunities to reach customers and increase sales, but its closure has caused public controversy. This study analyzes the views and responses of TikTok users in Indonesia to the closure of TikTok Shop, using a dataset obtained from Twitter. The research methodology consists of labeling, oversampling, data splitting, and classification with four machine learning methods: SVM, Random Forest, Decision Tree, and Deep Learning (H2O). The contribution of this research is to enrich the understanding of how machine learning can be applied to sentiment analysis of the TikTok Shop closure. In testing, Deep Learning (H2O) obtained an AUC of 0.900 with SMOTE and 0.867 without it; SVM obtained 0.885 with SMOTE and 0.881 without; Random Forest obtained 0.822 with SMOTE and 0.830 without; and Decision Tree obtained 0.59 with SMOTE and 0.646 without. SMOTE thus raised the AUC for Deep Learning (H2O) and SVM but lowered it for Random Forest and Decision Tree. Deep Learning (H2O) with SMOTE performed better than SVM, Random Forest, and Decision Tree, and its AUC of 0.900 indicates excellent performance for sentiment analysis of the TikTok Shop closure. These findings have significant implications for social electronic commerce, where social media analysts could apply the approach.
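The two technical ingredients named in the abstract, SMOTE oversampling and AUC scoring, can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not state which tooling they used. The function names `smote` and `auc`, the parameter `k`, and the toy data are hypothetical and chosen only for illustration. The sketch shows the core idea of SMOTE (a synthetic minority sample is interpolated between a minority point and one of its nearest minority neighbours) and the pairwise definition of AUC (the fraction of positive/negative pairs that the classifier ranks correctly).

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (hypothetical helper).

    Each synthetic point lies on the segment between a sampled minority
    point and one of its k nearest minority neighbours.
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not from the study): four minority-class feature vectors.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6)
print(len(new_points))  # 6 synthetic points

# Toy scores: three of the four positive/negative pairs are ordered correctly.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

Under this definition, an AUC of 0.900 means that a randomly chosen positive example receives a higher score than a randomly chosen negative example in about 90% of pairs. The pairwise loop is quadratic in the number of examples, so library implementations based on ranking are preferable for real datasets.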
