Sentiment Analysis on Social Media Using Word2Vec and Gated Recurrent Unit (GRU) with Genetic Algorithm Optimization
DOI:
https://doi.org/10.21108/ijoict.v10i1.903
Keywords:
Genetic Algorithm, GRU, Sentiment Analysis, TF-IDF, Word2Vec
Abstract
The evolution of information technology has changed the function of social media from a mere information repository to a platform for expressing opinions and aspirations. One of the most widely used social media platforms is Twitter, where users can express their opinions freely. A sentiment analysis process is therefore needed to classify each opinion as positive or negative. Sentiment analysis on social media is important for understanding user opinions, monitoring public perception, measuring campaign performance, identifying trends and opportunities, and improving customer service. This research builds a sentiment analysis model on the topic of the presidential election, using a dataset of 39,791 entries, the GRU method, TF-IDF feature extraction, Word2Vec feature expansion with an IndoNews corpus of 142,545 entries, and Genetic Algorithm optimization. The test results show that the highest accuracy achieved is 83.39%, an improvement of 1.42% over the baseline. This performance was achieved by combining TF-IDF with a maximum of 5,000 features, applying Word2Vec at top-1 similarity, and applying the Genetic Algorithm for feature optimization. The study shows that Word2Vec feature expansion together with Genetic Algorithm optimization improves the accuracy of the resulting model.
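The abstract does not spell out how top-1 similarity feature expansion works, so the following is only a minimal sketch under stated assumptions. It assumes that a TF-IDF feature with a zero weight in a document is filled in when its most similar word occurs in that document. The function names `tfidf_matrix` and `expand_features`, the hand-written `similar` table (standing in for a trained Word2Vec model), the choice to copy the neighbour's weight, and the toy documents are all hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def tfidf_matrix(docs, max_features=5000):
    """Return (vocab, rows) of smoothed TF-IDF weights for whitespace-tokenized docs."""
    tokenized = [d.lower().split() for d in docs]
    term_counts = Counter(t for doc in tokenized for t in doc)
    # Keep at most max_features terms, ranked by corpus frequency
    # (alphabetical order breaks ties so the result is deterministic).
    vocab = sorted(sorted(term_counts), key=lambda t: -term_counts[t])[:max_features]
    doc_freq = Counter(t for doc in tokenized for t in set(doc))
    n_docs = len(docs)
    rows = []
    for doc in tokenized:
        tf = Counter(doc)
        rows.append(
            [tf[t] * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0) for t in vocab]
        )
    return vocab, rows


def expand_features(vocab, rows, docs, similar, top_n=1):
    """Fill zero-valued features whose top-n similar word appears in the document.

    Assumption: the filled value is the neighbour's own weight in that document.
    """
    index = {t: i for i, t in enumerate(vocab)}
    expanded = []
    for row, doc in zip(rows, docs):
        tokens = set(doc.lower().split())
        new_row = list(row)
        for term, i in index.items():
            if row[i] != 0:
                continue  # the feature is already present in this document
            for neighbour in similar.get(term, [])[:top_n]:
                j = index.get(neighbour)
                if neighbour in tokens and j is not None:
                    new_row[i] = row[j]
                    break
        expanded.append(new_row)
    return expanded


# Toy data; `similar` is a hypothetical stand-in for Word2Vec most-similar lookups.
docs = ["pemilu berjalan bagus", "pemilu berjalan baik", "kandidat buruk"]
similar = {"bagus": ["baik"], "baik": ["bagus"]}

vocab, rows = tfidf_matrix(docs, max_features=5000)
expanded = expand_features(vocab, rows, docs, similar, top_n=1)
```

In this sketch the first document has no weight for "baik" before expansion and receives the weight of "bagus" afterwards, while the third document is left unchanged because none of its words have a listed neighbour. A Genetic Algorithm step would then search over subsets of the expanded feature columns, which is not shown here.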
License
A manuscript submitted to IJoICT must be the original work of the author(s), must contain no element of plagiarism, and must not have been published or be under consideration for publication in other journals. The author(s) shall agree to assign all copyright of the published article to IJoICT. Requests related to future re-use and re-publication of major or substantial parts of the article must be discussed with the editors of IJoICT.