XGBoost for Predicting Airline Customer Satisfaction Based on Computational Efficient Questionnaire
DOI:
https://doi.org/10.21108/ijoict.v9i2.864
Keywords:
Adaptive Boosting (AdaBoost), Prediction, Airline Customer, Missing Value
Abstract
Customer satisfaction is built through a well-crafted service quality strategy, which forms the cornerstone of a successful business-customer relationship. Establishing and nurturing these relationships is vital for long-term success. In the airline industry, a persistent challenge is improving the in-flight passenger experience, which requires a thorough understanding of customer demands. Addressing this challenge is crucial for airlines that aim to thrive in a competitive market, and it underlines the importance of high-quality service. This study addresses the issue through predictive analysis of airline customer satisfaction data. We predict customer satisfaction levels using an Extreme Gradient Boosting (XGBoost) ensemble-based model. Missing values in the dataset are handled with mean-value imputation. Furthermore, we introduce a novel logistic Pearson Gini (Log-PG) score to identify the factors that significantly influence airline customer satisfaction. Our predictive model achieved an accuracy and a precision of 0.96. To assess the model, we compared it with other boosting-type ensemble prediction models, such as gradient boosting and adaptive boosting (AdaBoost). The comparison showed that the XGBoost model outperformed these models in predicting airline customer satisfaction.
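As a rough illustration of two steps named in the abstract, the sketch below shows mean-value imputation of missing questionnaire answers and the computation of accuracy and precision from binary labels. It is a minimal sketch and not the authors' implementation: the function names, the toy ratings, and the labels are all hypothetical, and the Log-PG score is not reproduced because the abstract does not define it.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Assumption: a column with no observed values falls back to 0.0.
        col_means.append(mean(observed) if observed else 0.0)
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0
    return sum(t == positive for t in predicted_pos) / len(predicted_pos)


# Hypothetical questionnaire ratings (None marks a missing answer).
ratings = [
    [5, 4, None],
    [3, None, 2],
    [4, 2, 4],
]
imputed = impute_column_means(ratings)

# Hypothetical labels: 1 = satisfied, 0 = not satisfied.
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]
acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
```

In a full pipeline, the imputed feature matrix would then be passed to a boosting classifier (for example, the one provided by the `xgboost` package), and the same two metrics would be computed on held-out data.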
License
Manuscripts submitted to IJoICT must be the original work of the author(s), contain no element of plagiarism, and must not have been published or be under consideration for publication in other journals. Author(s) shall agree to assign all copyright of the published article to IJoICT. Requests for future re-use or re-publication of major or substantial parts of the article must be discussed with the editors of IJoICT.