The Generating Indonesian Paraphrased Sentences with Verbal Predicate Replacement

Authors

  • Bunyamin, Telkom University
  • Arie Ardiyanti Suryani, The School of Computing, Telkom University

DOI:

https://doi.org/10.34818/INDOJC.2023.8.3.709

Keywords:

Paraphrase, Predicate, Lexical Substitution, Semantic Similarity, word2vec

Abstract

Sentence paraphrasing restates a sentence using different diction without changing its meaning. Sentences can be paraphrased in several ways, including synonym substitution, changing the sentence form, or replacing the predicate of the sentence. This research aims to produce a paraphrase generator whose output is semantically similar to the original sentence. The method identifies verbal predicates in simple sentences using PoS tagging, then looks for words similar to the predicate using word2vec similarity. A list of antonyms is used to improve the lexical substitution results. The results are evaluated by human judgment against the original sentences. The experiments show that, on a dataset of 600 sentences, 48.37% of the sentences are semantically similar to the original, 20.93% show semantic reduction, and 30.70% are not semantically similar.
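
The pipeline described in the abstract (find the verbal predicate from PoS tags, rank replacement candidates by embedding similarity, drop known antonyms) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the three-dimensional vectors are invented stand-ins for a trained word2vec model, the `NN`/`VB` tags are an assumed tagset, and the names `TOY_VECTORS`, `ANTONYMS`, `find_predicate`, and `paraphrase` are hypothetical.

```python
import math

# Toy embeddings standing in for a trained word2vec model.
# The numbers are invented purely for illustration.
TOY_VECTORS = {
    "membeli": [0.9, 0.1, 0.0],      # to buy
    "memborong": [0.85, 0.15, 0.05], # to buy up
    "menjual": [0.8, 0.2, 0.1],      # to sell (an antonym that sits close by)
    "membaca": [0.1, 0.9, 0.2],      # to read
}

# Hypothetical antonym list used to filter substitution candidates.
ANTONYMS = {"membeli": {"menjual"}}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def find_predicate(tagged):
    """Return the index of the first verb-tagged token, or None."""
    for i, (_, tag) in enumerate(tagged):
        if tag.startswith("VB"):  # assumed verb tag
            return i
    return None


def paraphrase(tagged, min_similarity=0.5, use_antonyms=True):
    """Replace the verbal predicate with similar words, best match first."""
    idx = find_predicate(tagged)
    if idx is None:
        return []
    predicate = tagged[idx][0]
    if predicate not in TOY_VECTORS:
        return []
    scored = [
        (cosine(TOY_VECTORS[predicate], vec), word)
        for word, vec in TOY_VECTORS.items()
        if word != predicate
    ]
    results = []
    for score, word in sorted(scored, reverse=True):
        if score < min_similarity:
            continue  # too dissimilar to be a plausible replacement
        if use_antonyms and word in ANTONYMS.get(predicate, set()):
            continue  # close in vector space but opposite in meaning
        words = [token for token, _ in tagged]
        words[idx] = word
        results.append(" ".join(words))
    return results


# A pre-tagged simple sentence: "ibu membeli sayur" (mother buys vegetables).
sentence = [("ibu", "NN"), ("membeli", "VB"), ("sayur", "NN")]
print(paraphrase(sentence))
print(paraphrase(sentence, use_antonyms=False))
```

In this toy setup the antonym "menjual" scores almost as high as the legitimate candidate, which is the kind of case an antonym list is meant to catch: words with opposite meanings often appear in similar contexts and therefore end up close together in embedding space.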

Published

2023-12-05

How to Cite

Bunyamin, & Suryani, A. A. (2023). The Generating Indonesian Paraphrased Sentences with Verbal Predicate Replacement. Indonesian Journal on Computing (Indo-JC), 8(3), 1–10. https://doi.org/10.34818/INDOJC.2023.8.3.709

Section

Computer Science