Building Monolingual Word Alignment for Indonesian Translations of the Al-Quran
DOI:
https://doi.org/10.34818/INDOJC.2019.4.2.331
Abstract
This paper discusses the development of monolingual word alignment for Indonesian translations of the Al-Quran. The topic was chosen because alignment is a core component of several natural language processing tasks, namely textual entailment recognition, textual similarity identification, paraphrase detection, question answering, and text summarization. In addition, the Al-Quran translation exists in many versions, so interpretation is needed to relate translations that differ in wording but share the same meaning. With this technique, differing words across Al-Quran translations can be aligned, so that the words are grouped by their semantic similarity. The technique can also be used further to build synonym sets and WordNet. The input to the system is a pair of translations of the same verse. The evaluation in this study yielded a correlation score of 0.82, with a system error tolerance of 0.18. The correlation between semantically similar verses can be considered adequate at this stage of the research; improving it further would require more complete features and knowledge bases.
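As an illustration of the input and output described in the abstract, the following is a minimal sketch of word alignment between two translations of one verse. It is not the method of the paper: the synonym groups, the similarity scores (1.0 and 0.8), the threshold, and the two tokenized sentences are all hypothetical values chosen for the example, and a greedy one-to-one matching is assumed in place of whatever features and knowledge base the authors used.

```python
from typing import List, Tuple

# Hypothetical toy synonym groups (an assumption for this sketch, not the
# knowledge base used in the paper).
SYNONYMS = [
    frozenset({"tuhan", "rabb"}),
    frozenset({"semesta", "seluruh"}),
]


def similarity(a: str, b: str) -> float:
    """Return 1.0 for identical words, 0.8 for listed synonyms, else 0.0."""
    if a == b:
        return 1.0
    if any(a in group and b in group for group in SYNONYMS):
        return 0.8
    return 0.0


def align(source: List[str], target: List[str],
          threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Greedily pick one-to-one word pairs, highest similarity first."""
    candidates = [
        (similarity(s, t), i, j)
        for i, s in enumerate(source)
        for j, t in enumerate(target)
    ]
    # Highest score first; ties broken by position for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    used_source, used_target = set(), set()
    pairs = []
    for score, i, j in candidates:
        if score < threshold:
            break  # remaining candidates are all below the threshold
        if i in used_source or j in used_target:
            continue  # each word may be aligned at most once
        used_source.add(i)
        used_target.add(j)
        pairs.append((i, j))
    return sorted(pairs)


# Illustrative token lists only; not quoted from any specific edition.
translation_a = ["segala", "puji", "bagi", "allah", "tuhan", "semesta", "alam"]
translation_b = ["segala", "puji", "milik", "allah", "rabb", "seluruh", "alam"]

for i, j in align(translation_a, translation_b):
    print(translation_a[i], "<->", translation_b[j])
```

In this sketch the pair "bagi" / "milik" stays unaligned because it is not in the toy synonym list, which mirrors the abstract's remark that a more complete knowledge base would be needed to improve results.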
License
- A manuscript submitted to IndoJC has to be an original work of the author(s), contain no element of plagiarism, and must not have been published or be under consideration for publication in other journals.
- Copyright on any article is retained by the author(s). Regarding copyright transfers please see below.
- Authors grant IndoJC a license to publish the article and identify itself as the original publisher.
- Authors grant IndoJC commercial rights to produce hardcopy volumes of the journal for sale to libraries and individuals.
- Authors grant any third party the right to use the article freely as long as its original authors and citation details are identified.
- The article and any associated published material are distributed under the Creative Commons Attribution 4.0 License.