Bayesian probability of default models with Langevin dynamics / Morelli, Giacomo; Conti, Andrea. - In: QUANTITATIVE FINANCE. - ISSN 1469-7688. - (2025), pp. 1-9.
Bayesian probability of default models with Langevin dynamics
Giacomo Morelli; Andrea Conti
2025
Abstract
Using machine learning to estimate the probability of default in credit risk is becoming a popular approach. Bayesian methodologies offer a probabilistic interpretation of model predictions and prevent overfitting, a significant flaw in numerous machine learning models. However, Bayesian inference based on Markov chain Monte Carlo (MCMC) algorithms comes with high computational costs. For credit scoring models, efficiency and performance are equally important. We propose two machine learning architectures based on Stochastic Gradient Langevin Dynamics (SGLD) to estimate the probability of default of loan applicants. This framework (i) allows us to sample from the true posterior without relying on typical MCMC algorithms, (ii) is not computationally expensive, and (iii) leverages the strengths of Bayesian approaches, such as flexible regularization. We apply this method to a Bayesian Logistic Regression and a Bayesian Neural Network. Furthermore, we perform a benchmarking analysis with different models and regularization techniques on four large retail loan datasets. We also address model explainability with the model-agnostic method of SHapley Additive exPlanations (SHAP).

| File | Size | Format | |
|---|---|---|---|
| Conti_bayesian-probability_2025.pdf (archive managers only). Note: original paper. Type: publisher's version (published version with the publisher's layout). License: all rights reserved. | 1.01 MB | Adobe PDF | Contact the author |
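
The abstract names SGLD as the sampler behind the Bayesian Logistic Regression model. As a rough illustration only, the sketch below applies a plain SGLD update to a toy logistic regression using the Python standard library. It is not the authors' implementation: the synthetic data, the Gaussian prior, the constant step size (the original SGLD formulation uses a decaying schedule), and the names `make_toy_data`, `sgld_logistic`, and `predict_pd` are all assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_toy_data(n=500, seed=1):
    # Hypothetical synthetic borrowers: an intercept plus one standardized feature.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        X.append([1.0, x1])
        y.append(1 if rng.random() < sigmoid(-1.0 + 2.0 * x1) else 0)
    return X, y


def sgld_logistic(X, y, n_steps=2000, batch_size=32, step_size=1e-3,
                  prior_var=1.0, burn_in=1000, seed=0):
    """Draw approximate posterior samples of logistic-regression weights."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    theta = [0.0] * d
    samples = []
    for t in range(n_steps):
        batch = rng.sample(range(n), batch_size)
        # Gradient of the assumed Gaussian log-prior N(0, prior_var * I).
        grad = [-w / prior_var for w in theta]
        # Mini-batch gradient of the Bernoulli log-likelihood, rescaled by
        # n / batch_size so it is an unbiased estimate of the full gradient.
        scale = n / batch_size
        for i in batch:
            p = sigmoid(sum(w * x for w, x in zip(theta, X[i])))
            for j in range(d):
                grad[j] += scale * (y[i] - p) * X[i][j]
        # Langevin step: half a step along the gradient plus Gaussian noise
        # whose variance equals the step size.
        noise_sd = math.sqrt(step_size)
        theta = [w + 0.5 * step_size * g + rng.gauss(0.0, noise_sd)
                 for w, g in zip(theta, grad)]
        if t >= burn_in:
            samples.append(list(theta))
    return samples


def predict_pd(samples, x):
    """Average the predicted default probability over the posterior samples."""
    total = 0.0
    for s in samples:
        total += sigmoid(sum(w * xi for w, xi in zip(s, x)))
    return total / len(samples)


if __name__ == "__main__":
    X, y = make_toy_data()
    samples = sgld_logistic(X, y)
    print("PD at feature = +2:", predict_pd(samples, [1.0, 2.0]))
    print("PD at feature = -2:", predict_pd(samples, [1.0, -2.0]))
```

Averaging predictions over the retained samples, rather than using a single weight vector, is what gives the probabilistic interpretation the abstract refers to. The injected noise and the prior term together act as the regularizer.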


