Machine learning is increasingly used to estimate the probability of default in credit risk. Bayesian methodologies offer a probabilistic interpretation of model predictions and help prevent overfitting, a significant flaw in numerous machine learning models. However, Bayesian inference based on Markov chain Monte Carlo (MCMC) algorithms comes with high computational costs. For credit scoring models, efficiency and performance are equally important. We propose two machine learning architectures based on Stochastic Gradient Langevin Dynamics (SGLD) to estimate the probability of default of loan applicants. This framework (i) allows us to sample from the true posterior without relying on typical MCMC algorithms, (ii) is not computationally expensive, and (iii) leverages the strengths of Bayesian approaches, such as flexibility in regularization. We apply this method to Bayesian logistic regression and to a Bayesian neural network. Furthermore, we perform a benchmarking analysis with different models and regularization techniques on four large retail loan datasets. We also address model explainability with the model-agnostic SHapley Additive exPlanations (SHAP) method.
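To make the SGLD idea in the abstract concrete, the following is a minimal sketch of SGLD sampling for a Bayesian logistic regression on synthetic data. It is not the authors' implementation: the Gaussian prior, the constant step size, the batch size, the synthetic data, and all names (`sgld_logistic`, `prior_var`, `applicant`) are assumptions made for illustration. Each update takes half a step along a minibatch estimate of the log-posterior gradient and adds Gaussian noise whose variance equals the step size. A constant step size is used for simplicity, which leaves some discretization bias; a decreasing schedule is the usual remedy.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgld_logistic(X, y, n_steps=2000, batch_size=64, step_size=1e-3,
                  prior_var=10.0, burn_in=500, seed=0):
    """Draw approximate posterior samples of logistic regression weights.

    Assumes an isotropic Gaussian prior N(0, prior_var * I) on the weights
    (an illustrative choice, not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    samples = []
    for t in range(n_steps):
        # Minibatch drawn without replacement from the full dataset.
        idx = rng.choice(n, size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        # Gradient of the log prior.
        grad_prior = -theta / prior_var
        # Minibatch log-likelihood gradient, rescaled to the full dataset.
        grad_lik = (n / batch_size) * (Xb.T @ (yb - sigmoid(Xb @ theta)))
        # Langevin noise with variance equal to the step size.
        noise = rng.normal(0.0, np.sqrt(step_size), size=d)
        theta = theta + 0.5 * step_size * (grad_prior + grad_lik) + noise
        if t >= burn_in:
            samples.append(theta.copy())
    return np.array(samples)

# Synthetic "loan" data: an intercept column plus two standardized features.
data_rng = np.random.default_rng(1)
X = np.column_stack([np.ones(1000), data_rng.normal(size=(1000, 2))])
true_theta = np.array([-1.0, 2.0, -1.5])
y = (data_rng.random(1000) < sigmoid(X @ true_theta)).astype(float)

samples = sgld_logistic(X, y)

# Posterior distribution of the default probability for one hypothetical applicant.
applicant = np.array([1.0, 0.5, -0.5])
pd_draws = sigmoid(samples @ applicant)
pd_mean = pd_draws.mean()
pd_interval = np.percentile(pd_draws, [2.5, 97.5])
```

Because every retained draw of the weights yields its own default probability, `pd_draws` gives a distribution rather than a point estimate, which is the probabilistic interpretation the abstract refers to; `pd_interval` is a rough 95% credible interval under the stated assumptions.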

Bayesian probability of default models with Langevin dynamics / Morelli, Giacomo; Conti, Andrea. - In: QUANTITATIVE FINANCE. - ISSN 1469-7688. - (2025), pp. 1-9.

Bayesian probability of default models with Langevin dynamics

Giacomo Morelli; Andrea Conti
2025

probability of default; stochastic gradient Langevin dynamics; Bayesian neural network; Bayesian inference; MCMC; machine learning
01 Journal publication::01a Journal article
Files attached to this record
File: Conti_bayesian-probability_2025.pdf
Access: archive managers only
Note: Original paper
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.01 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1751106