Classification of Hadith Topic of Indonesian Translation Using K-Nearest Neighbor and Chi-Square
DOI:
https://doi.org/10.21108/ijoict.v7i2.573
Keywords:
chi-square, hadith, text classification, k-nearest neighbor
Abstract
Hadith, alongside the Qur'an, is the main guide to life for Muslims and can be applied in everyday life. Hadith also contains all the words and deeds of the Prophet Muhammad, which are used as a source of Islamic law. Therefore, many readers, especially Muslims, are interested in studying hadith. However, the large number of hadiths makes them difficult to read for these readers and for those who are still unfamiliar with Islam. We therefore conducted a study to classify hadith texts by type of teaching, so that readers can more easily get an overview, or an additional reference, when reading and searching for hadith of a given type. This study uses K-Nearest Neighbor (KNN) as the classifier and chi-square as the feature selection method. We also ran several test scenarios, including a modified stopword removal step in preprocessing and experiments with different values of k for KNN, to determine the best performance. The best performance was obtained with k = 7 on KNN, without chi-square and with the modified stopword removal, giving a Hamming loss of 0.1042, or about 89.58% of the data correctly classified.
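The reported loss of 0.1042 and the 89.58% figure are related by a simple complement: one minus the loss gives the share of correct label assignments. The sketch below illustrates that relationship in plain Python. It assumes the task is evaluated per label slot, as the choice of this metric suggests; the function name `hamming_loss`, the three-label layout, and the toy data are hypothetical and are not taken from the study.

```python
def hamming_loss(y_true, y_pred):
    """Return the fraction of label slots that were predicted incorrectly.

    Both arguments are lists of equal-length 0/1 label vectors,
    one vector per document.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each row must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    if total == 0:
        raise ValueError("no labels to evaluate")
    return wrong / total


# Hypothetical toy data: 4 documents, 3 binary labels each.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 1], [0, 0, 1], [1, 0, 0]]

loss = hamming_loss(y_true, y_pred)
print(f"Hamming loss: {loss:.4f}")
print(f"Correct label assignments: {1 - loss:.2%}")

# The abstract's figures follow the same complement rule.
reported_loss = 0.1042
print(f"Reported: {1 - reported_loss:.2%} correct")
```

In the toy data, two of the twelve label slots are wrong, so the loss is 2/12. Applying the same complement to the reported 0.1042 reproduces the 89.58% stated in the abstract.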
License
Manuscripts submitted to IJoICT must be the original work of the author(s), contain no element of plagiarism, and must not have been published or be under consideration for publication in other journals. Author(s) shall agree to assign all copyright of the published article to IJoICT. Requests related to future reuse and republication of major or substantial parts of the article must be discussed with the editors of IJoICT.