Peer-to-peer loan acceptance and default prediction with artificial intelligence

R Soc Open Sci. 2020 Jun 10;7(6):191649. doi: 10.1098/rsos.191649. eCollection 2020 Jun.

Abstract

Logistic regression (LR) and support vector machine algorithms, together with linear and nonlinear deep neural networks (DNNs), are applied to lending data in order to replicate lender acceptance of loans and predict the likelihood of default of issued loans. A two-phase model is proposed: the first phase predicts loan rejection, while the second predicts default risk for approved loans. LR was found to be the best performer for the first phase, with a test-set macro recall of 77.4%. DNNs were applied to the second phase only, where they achieved the best performance, with a test-set recall of 72% for defaults. This shows that artificial intelligence can improve current credit risk models, reducing the default risk of issued loans by as much as 70%. The models were also applied to loans taken for small businesses alone. The first phase of the model performs significantly better when trained on the whole dataset. In contrast, the second phase performs significantly better when trained on the small business subset. This suggests a potential discrepancy between how these loans are screened and how they should be analysed in terms of default prediction.
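The two reported metrics and the two-phase structure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the callable-based model interface and the 0.5 default threshold are all assumptions made for illustration. Macro recall is taken here in its usual sense, the unweighted mean of the per-class recalls.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Sequence


def per_class_recall(y_true: Sequence[Hashable],
                     y_pred: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Recall per class: correctly predicted members / all true members."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


def macro_recall(y_true: Sequence[Hashable],
                 y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def two_phase_decision(application: dict,
                       accept_model: Callable[[dict], bool],
                       default_model: Callable[[dict], float],
                       default_threshold: float = 0.5) -> str:
    """Hypothetical two-phase screen.

    Phase 1 replicates the accept/reject decision; phase 2 scores
    default risk only for applications that pass phase 1.
    The threshold value is an assumption, not taken from the paper.
    """
    if not accept_model(application):
        return "rejected"
    if default_model(application) >= default_threshold:
        return "accepted, flagged as likely default"
    return "accepted"


# Toy labels: 1 = loan accepted, 0 = loan rejected.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = macro_recall(y_true, y_pred)  # (3/4 + 1/2) / 2 = 0.625
```

In this toy example the class-1 recall is 0.75 and the class-0 recall is 0.5, so the macro recall is 0.625. Because each class contributes equally regardless of its size, the metric is not dominated by the majority class, which matters when rejected or defaulted loans are a minority of the data.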

Keywords: artificial intelligence; big data; default risk; financial automation; peer-to-peer lending.

Associated data

  • Dryad/10.5061/dryad.qbzkh18cq