Identifying Optimal Parameters And Their Impact For Predicting Credit Card Defaulters Using Machine-Learning Algorithms

  • Muhammad Qasim Idrees, Department of Computer Science & Information Technology, Virtual University of Pakistan
  • Humaira Naeem, Department of Computer Science & Information Technology, Virtual University of Pakistan
  • Muhammad Imran, Department of Computer Science & Information Technology, Virtual University of Pakistan
  • Asma Batool, Department of Computer Science & Information Technology, Virtual University of Pakistan
  • Nadia Tabassum, Department of Computer Science & Information Technology, Virtual University of Pakistan
Keywords: Data Mining, Machine Learning, Credit Scoring, Classification, Feature Selection, Gain Ratio, Optimal Parameters.

Abstract

Data mining and machine learning are emerging technologies that are spreading rapidly into every field because of their benefits, and the financial sector is no exception. Many studies have analyzed banking data with machine learning techniques. These studies have several problems: most focus mainly on achieving high accuracy, and some only compare the performance of different classifiers. Another major drawback is that they do not identify optimal parameters or their impact. In this research, we identify optimal parameters that are valuable for the credit scoring process and might also be used to predict credit card defaulters, and we determine their impact on the results. We use feature selection and classification techniques to identify the optimal parameters and their impact on the identification of credit card defaulters. We use three classifiers, Kstar, SMO, and multilayer perceptron, and repeat the process of feature selection and classification for each of them. First, we apply feature selection techniques to the dataset with each classifier to find candidate optimal parameters. In the next phase, we use classification to determine the impact of these candidates and confirm our findings. In each round of classification, we include and exclude different parameters available in the dataset and record the results of every run with each classifier. In this way, we identify the optimal parameters and their impact on the results, and we also analyze the performance of the classifiers. For this study, we use the “credit card defaults” dataset obtained from the UCI Machine Learning Repository.
We use two feature selection techniques, the ranker approach and the evolutionary search method, and then apply classification techniques to the dataset. This research can help reduce the complexity of the credit scoring process. Through this study, we identify up to six optimal parameters and determine their impact on the performance of the classifiers. We also find that multilayer perceptron was the best-performing classifier of the three. This work can be extended to other fields in the future, where the same mechanism for finding optimal parameters and their impact can help predict results.
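
As an illustration only, the ranker approach combined with the Gain Ratio criterion named in the keywords can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the two feature names, and the eight-row toy data are all hypothetical, and the features are assumed to be categorical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, normalised by the feature's own entropy."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Return (feature name, gain ratio) pairs sorted from best to worst."""
    scores = {name: gain_ratio(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: these are NOT attributes or rows from the UCI dataset.
toy_columns = {
    "late_last_month": ["yes", "yes", "no", "no", "yes", "no", "no", "yes"],
    "education": ["grad", "school", "grad", "school", "grad", "school", "grad", "school"],
}
toy_labels = [1, 1, 0, 0, 1, 0, 0, 1]  # 1 = defaulter, 0 = non-defaulter

ranking = rank_features(toy_columns, toy_labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data the first feature separates the two classes perfectly and the second carries no information about them, so a ranker would place the first at the top. Keeping only the top-ranked features and re-running a classifier with and without them mirrors the include-and-exclude procedure described in the abstract.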

 

Published
2022-03-30
Section
Articles