ML-based Approach for Credit Risk Assessment Using Parallel Calculations

dc.contributor.author: Hentosh, L.
dc.contributor.author: Tsikalo, Y.
dc.contributor.author: Kustra, N.
dc.contributor.author: Kutucu, H.
dc.date.accessioned: 2024-09-29T16:22:40Z
dc.date.available: 2024-09-29T16:22:40Z
dc.date.issued: 2022
dc.department: Karabük Üniversitesi [en_US]
dc.description: 3rd International Workshop on Computational and Information Technologies for Risk-Informed Systems, CITRisk 2022 -- 12 January 2023 -- Virtual, Online -- 189474 [en_US]
dc.description.abstract: In banks and other credit organizations, the task of credit scoring often arises when making decisions on granting loans. This task consists of making a reasoned decision, based on information about the applicant, on whether the applicant should be granted a loan and, if so, on what terms. This paper proposes applying parallel computation to the Random Forest algorithm when solving the credit scoring task. This approach made it possible to significantly reduce the time needed for model training and dataset processing. As expected, with less data the resulting acceleration and efficiency worsen. With only 2500 entries, the sequential algorithm runs faster than the parallel one. The developed software was tested on three different processors (4-core, 8-core, and 12-core) to evaluate the parallelization quality of data pre-processing. The classification algorithm is computationally complex and time-consuming, so practically the same acceleration was obtained for processing 5000 and 10000 records. With this amount of data, the 12-core processor gave the biggest gain in time when working with 12 threads. As a result, an acceleration of more than 6 is achievable. This efficiency indicator of the proposed parallel algorithm can be significantly improved by varying the number of threads and considering current trends in the development of multi-core computing architectures. In addition, using data without pre-processing, the following evaluation metrics were obtained: AUC=0.9 and Precision=0.845; using data after pre-processing, the metrics were AUC=0.86 and Precision=0.89. © 2022 Copyright for this paper by its authors. [en_US]
dc.identifier.endpage: 173 [en_US]
dc.identifier.issn: 1613-0073
dc.identifier.scopus: 2-s2.0-85163841243 [en_US]
dc.identifier.scopusquality: N/A [en_US]
dc.identifier.startpage: 161 [en_US]
dc.identifier.uri: https://hdl.handle.net/20.500.14619/10212
dc.identifier.volume: 3422 [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.language.iso: en [en_US]
dc.publisher: CEUR-WS [en_US]
dc.relation.ispartof: CEUR Workshop Proceedings [en_US]
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: acceleration [en_US]
dc.subject: classification task [en_US]
dc.subject: Credit scoring [en_US]
dc.subject: parallel algorithm [en_US]
dc.subject: Random forest [en_US]
dc.title: ML-based Approach for Credit Risk Assessment Using Parallel Calculations [en_US]
dc.type: Conference Object [en_US]
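
The abstract reports results as "acceleration" and "efficiency" without defining them in this record. A minimal sketch of how these two quantities are commonly computed follows, assuming the standard definitions (acceleration = sequential time / parallel time; efficiency = acceleration / number of threads). Whether the paper uses exactly these definitions is an assumption. The function names and the timings in the example are hypothetical and are not taken from the paper.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """Acceleration: how many times faster the parallel run is than the sequential one."""
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("execution times must be positive")
    return t_sequential / t_parallel


def efficiency(t_sequential: float, t_parallel: float, n_threads: int) -> float:
    """Acceleration per thread; 1.0 would mean ideal linear scaling."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    return speedup(t_sequential, t_parallel) / n_threads


if __name__ == "__main__":
    # Hypothetical timings in seconds, chosen only to illustrate the arithmetic.
    t_seq, t_par, threads = 60.0, 10.0, 12
    print("acceleration:", speedup(t_seq, t_par))
    print("efficiency:", efficiency(t_seq, t_par, threads))
```

Under these assumed definitions, the reported acceleration of more than 6 with 12 threads would correspond to an efficiency above 0.5, and a sequential run that beats the parallel one (as reported for 2500 entries) would correspond to an acceleration below 1.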
