Digest Finance

Data Mining Techniques: Modern Approaches to Application in Credit Scoring

Vol. 22, Iss. 4, DECEMBER 2017

Received: 4 July 2017

Received in revised form: 9 August 2017

Accepted: 24 August 2017

Available online: 19 December 2017

Subject Heading: Banking

JEL Classification: C38, C55, D81

Pages: 400–412


Volkova V.S. Financial University under the Government of the Russian Federation, Moscow, Russian Federation EVolkova@fa.ru

Gisin V.B. Financial University under the Government of the Russian Federation, Moscow, Russian Federation VGisin@fa.ru

Solov'ev V.I. Financial University under the Government of the Russian Federation, Moscow, Russian Federation VSoloviev@fa.ru

Importance This article examines the current state of research in machine learning and data mining, computational methods that are increasingly combined with conventional lending models such as credit scoring.
Objectives The article aims to classify modern credit scoring methods and to describe models for comparing their effectiveness.
Methods We reviewed relevant scientific publications on the subject indexed in Google Scholar.
Results The article presents a classification of modern data mining techniques used in credit scoring.
Conclusions and Relevance Credit scoring models based on machine learning procedures, as well as hybrid models that combine several methods, can provide the level of efficiency required in the modern environment.

Keywords: loan scoring, credit score, machine learning, data mining





ISSN 2311-9438 (Online)
ISSN 2073-8005 (Print)
