Establishment of Business Loan Default Prediction Model by Integrating Survival Analysis with Logistic Regression

Document Type: Article


1 Department of Industrial Engineering and Management, National Chiao Tung University, Hsinchu 300, Taiwan

2 Department of Management Sciences, R.O.C. Military Academy, Kaohsiung 830, Taiwan; Institute of Innovation and Circular Economy, Asia University, Taichung 413, Taiwan


A capital conservation buffer that is too small leaves a financial institution unable to withstand fluctuations in the economic cycle, whereas one that is too large reduces the institution's available funds and thus the capital available for investment. To address this issue, this study establishes a business loan default prediction model by integrating survival analysis with logistic regression. In the case validation, the reliability of the proposed approach is verified using data on businesses that have been granted loans by financial institutions in Taiwan, and the approach is compared with the Cox proportional hazards model, which financial institutions frequently apply. The empirical results show that the proposed approach predicts business loan default states that track the actual default trend more closely than those of the Cox proportional hazards model. It thereby gives financial institutions effective and reliable reference information for preparing an appropriate capital conservation buffer and improving their capital flexibility.
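
One common way to combine survival analysis with logistic regression is a discrete-time hazard model, in which each loan is expanded into one record per period at risk and the per-period default probability is modeled on the logit scale. The sketch below illustrates that general idea only; it is not necessarily the formulation used in this study. The loan records, the function names, and the absence of covariates are all assumptions made for illustration. With period indicators as the only predictors, the maximum-likelihood logistic fit reduces to the observed default fraction in each period, so no optimizer is needed here.

```python
import math

def expand_person_period(loans):
    """Expand (duration, defaulted) loans into one (period, event) row per period at risk."""
    rows = []
    for duration, defaulted in loans:
        for t in range(1, duration + 1):
            # The event flag is 1 only in the final observed period of a defaulted loan;
            # censored loans contribute rows with event = 0 throughout.
            event = 1 if (defaulted and t == duration) else 0
            rows.append((t, event))
    return rows

def period_hazards(rows):
    """Per-period default hazard: defaults in period t divided by loans at risk in t."""
    at_risk, events = {}, {}
    for t, event in rows:
        at_risk[t] = at_risk.get(t, 0) + 1
        events[t] = events.get(t, 0) + event
    return {t: events[t] / at_risk[t] for t in sorted(at_risk)}

def survival_curve(hazards):
    """Survival probability S(t) as the running product of (1 - hazard)."""
    surv, s = {}, 1.0
    for t in sorted(hazards):
        s *= 1.0 - hazards[t]
        surv[t] = s
    return surv

def logit(p):
    """Log-odds of p; the scale on which a logistic regression is linear."""
    return math.log(p / (1.0 - p))

# Hypothetical toy data: (periods observed, whether the loan defaulted).
loans = [(1, True), (2, False), (3, True), (3, False), (2, True)]

rows = expand_person_period(loans)
hazards = period_hazards(rows)
survival = survival_curve(hazards)
# Period intercepts of the equivalent intercept-only logistic model.
intercepts = {t: logit(h) for t, h in hazards.items()}
```

In practice, borrower covariates would be added as further columns of the person-period rows and the logistic model fitted numerically; the expansion step and the hazard-to-survival product stay the same.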

