Credit Scoring Using the XGBoost and Logistic Regression Algorithms

Ainul Yaqin

Abstract


Credit scoring is a process or system used by financing
institutions and banks to assess the eligibility of a loan
applicant. It is needed to avoid losses caused by default. This
calls for an efficient, fast, and accurate method for classifying
whether an applicant should be granted a loan. The author
proposes a machine learning approach and compares the XGBoost
and logistic regression algorithms. After training and testing
with stratified k-fold cross-validation, XGBoost achieved an
average accuracy of 85.51%, F1 score of 83.81%, precision of
83.80%, and recall of 84.04%, while logistic regression achieved
an average accuracy of 85.94%, F1 score of 85.36%, precision of
80.08%, and recall of 91.52%. Both algorithms classified loan
eligibility with an average accuracy above 85%, so either can be
used to assist financial institutions and credit analysts.
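
The evaluation protocol described in the abstract (stratified k-fold cross-validation, with accuracy, precision, recall, and F1 score averaged over the folds) can be sketched as follows. This is a minimal illustration in plain Python, not the author's code. The synthetic data, the choice of k = 5, and the placeholder `threshold_model` are all assumptions made here; in practice the placeholder would be replaced by a wrapper around a real classifier such as `xgboost.XGBClassifier` or `sklearn.linear_model.LogisticRegression`.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_kfold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)  # deal each class round-robin
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / denom if denom else 0.0,
    }


def threshold_model(X_train, y_train, X_test):
    """Hypothetical stand-in classifier: cut at the midpoint of class means."""
    pos = mean(x[0] for x, t in zip(X_train, y_train) if t == 1)
    neg = mean(x[0] for x, t in zip(X_train, y_train) if t == 0)
    cut = (pos + neg) / 2
    return [1 if (x[0] > cut) == (pos > neg) else 0 for x in X_test]


def cross_validate(X, y, fit_predict, k=5, seed=0):
    """Average each metric over the k stratified folds."""
    per_fold = []
    for train, test in stratified_kfold(y, k, seed):
        y_pred = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in test]
        )
        per_fold.append(binary_metrics([y[i] for i in test], y_pred))
    return {name: mean(m[name] for m in per_fold) for name in per_fold[0]}


# Synthetic, illustrative data: 40 "eligible" (1) and 60 "not eligible" (0).
rng = random.Random(42)
y = [1] * 40 + [0] * 60
X = [[rng.gauss(2.0 if label else 0.0, 1.0)] for label in y]

scores = cross_validate(X, y, threshold_model, k=5)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because each metric is averaged fold by fold, the mean F1 score need not equal the F1 score computed from the mean precision and mean recall, which is consistent with the figures reported above.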


Keywords


credit scoring, classification, XGBoost, logistic regression, data mining






DOI: https://doi.org/10.30591/jpit.v8i1.4337



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
