💳 Credit Risk & Statistical Learning

📍 EPFL – Master in Financial Engineering, Year 2 (2025)
👥 Team: Matthias Wyss, William Jallot, Antoine Garin
📄 Final Report: Report
🔗 GitHub Repository: GitHub

This project was developed as part of the FIN-417: Quantitative Risk Management course at EPFL. The objective was to build and evaluate statistical learning models to predict default probabilities and assess the profitability of different lending strategies under risk constraints.

We analyze a dataset of retail borrowers (age, income, employment status) to estimate repayment probabilities and optimize bank returns using quantitative risk measures.

🔍 Strategy Components

Feature Engineering: Statistical analysis of borrower characteristics and identification of key risk drivers (e.g., debt-to-income ratio, employment stability).
Default Modeling: Comparative analysis of two main approaches:
- Logistic Regression: Used as a baseline for linear relationships.
- SVM with RBF Kernel: Implemented to capture complex non-linearities in borrower profiles (e.g., higher risk at both ends of the age spectrum).
Probability Calibration: Platt scaling fits a sigmoid to the SVM's decision-function scores (signed distances to the decision boundary), turning them into calibrated probabilities that can drive lending decisions.
Risk-Based Lending: A selective lending policy under which credit is extended only to applicants whose predicted survival (non-default) probability exceeds 95%.
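
The components above can be sketched end to end with scikit-learn. This is a minimal illustration, not the project's code: the synthetic borrower data, the U-shaped age effect, the coefficients, and the hyperparameters (`C`, `gamma`, `cv`) are all assumptions made for the example. `CalibratedClassifierCV` with `method="sigmoid"` is scikit-learn's implementation of Platt scaling.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the borrower data (NOT the project's dataset).
n = 2000
age = rng.uniform(18, 80, n)
income = rng.lognormal(mean=11.0, sigma=0.4, size=n)
employed = rng.integers(0, 2, n)

# Hypothetical default mechanism: risk is U-shaped in age (higher at both
# ends of the range) and lower for employed, higher-income borrowers.
logit = (-3.0 + 3.0 * ((age - 49.0) / 31.0) ** 2
         - 0.8 * employed - 0.5 * (np.log(income) - 11.0))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))  # 1 = default
X = np.column_stack([age, income, employed])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Baseline: logistic regression on standardized features.
logreg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
logreg.fit(X_train, y_train)

# RBF-kernel SVM; Platt scaling (method="sigmoid") maps its decision-function
# scores to probabilities using 5-fold cross-validation.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
svm_platt = CalibratedClassifierCV(svm, method="sigmoid", cv=5)
svm_platt.fit(X_train, y_train)

# Selective lending rule: lend only if predicted repayment probability > 95%.
p_default_hat = svm_platt.predict_proba(X_test)[:, 1]
p_repay_hat = 1.0 - p_default_hat
lend = p_repay_hat > 0.95

print("Applicants scored:", lend.size)
print("Applicants accepted:", int(lend.sum()))
```

Standardizing features inside the pipeline matters for the RBF kernel, which is sensitive to feature scale; wrapping the whole pipeline in the calibrator keeps the scaling inside each cross-validation fold.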

📊 Performance & Risk Analysis

On the non-linear dataset, the SVM clearly outperforms logistic regression: its AUC is higher (0.980 vs. 0.862) and its test cross-entropy loss is less than half as large (0.0671 vs. 0.1486). The more accurate default probabilities in turn lead to more stable portfolio returns.

Model	AUC (Non-linear Dataset)	Cross-Entropy Loss (Test)
Logistic Regression	0.862	0.1486
SVM (RBF Kernel)	0.980	0.0671
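
For reference, both metrics in the table can be computed with `sklearn.metrics`. The sketch below uses a tiny made-up set of labels and predicted probabilities (not the project's results) and also spells out the cross-entropy formula by hand.

```python
import numpy as np
from sklearn.metrics import log_loss, roc_auc_score

# Toy labels (1 = default) and predicted default probabilities, for illustration only.
y_true = np.array([0, 0, 1, 1])
p_hat = np.array([0.1, 0.4, 0.35, 0.8])

# AUC: probability that a random defaulter is scored above a random non-defaulter.
auc = roc_auc_score(y_true, p_hat)

# Cross-entropy (log loss): mean negative log-likelihood of the true labels.
ce = log_loss(y_true, p_hat)
ce_manual = -np.mean(y_true * np.log(p_hat) + (1 - y_true) * np.log(1 - p_hat))

print(f"AUC: {auc:.3f}")
print(f"Cross-entropy: {ce:.4f} (manual: {ce_manual:.4f})")
```

AUC depends only on how the predictions rank borrowers, while cross-entropy also penalizes poorly calibrated probabilities, which is why the two are reported side by side.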

Portfolio Risk Metrics (SVM Selective Strategy):

By simulating 50,000 scenarios via Monte Carlo methods, we quantified the potential downside of the selective lending strategy.

Metric	Value
Expected PnL	CHF 63,057.76
95% Value-at-Risk (VaR) Loss	CHF -55,000.00
95% Expected Shortfall (ES) Loss	CHF -53,312.36
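
The way such metrics are obtained from simulated scenarios can be sketched with NumPy. Everything about the portfolio here is hypothetical (number of loans, principal, interest rate, loss given default, default probabilities), and defaults are assumed independent across borrowers, which may differ from the report's setup. Values are on the PnL scale, so negative numbers are losses.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical portfolio parameters (not taken from the report).
n_loans = 100
n_scenarios = 50_000
principal = 10_000.0  # CHF lent per borrower
rate = 0.05           # interest earned if the loan is repaid
lgd = 1.0             # loss given default, as a fraction of principal
p_default = rng.uniform(0.005, 0.05, size=n_loans)

# Simulate independent Bernoulli defaults: one row per scenario.
defaults = rng.random((n_scenarios, n_loans)) < p_default

# PnL per loan: lose lgd * principal on default, earn interest otherwise.
pnl_per_loan = np.where(defaults, -lgd * principal, rate * principal)
pnl = pnl_per_loan.sum(axis=1)

expected_pnl = pnl.mean()
var_95 = np.quantile(pnl, 0.05)       # 5% quantile of the PnL distribution
es_95 = pnl[pnl <= var_95].mean()     # average PnL in the worst 5% tail

print(f"Expected PnL: CHF {expected_pnl:,.2f}")
print(f"95% VaR (PnL quantile): CHF {var_95:,.2f}")
print(f"95% ES (tail mean): CHF {es_95:,.2f}")
```

Running the same simulation on the accepted applicants only and on the full applicant pool gives a direct comparison between a selective policy and a lend-to-all policy.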

Key Finding: The SVM-based selective strategy effectively mitigates tail risk compared to a “Lend-to-All” approach, maintaining high profitability while strictly controlling the Expected Shortfall.

🛠 Tools & Libraries

Python & Scikit-Learn: Model implementation, hyperparameter tuning, and performance metrics.
NumPy: High-performance Monte Carlo simulations for 50,000 default scenarios.
Matplotlib: Visualization of ROC curves and PnL distributions.

🧠 Techniques

Supervised Learning (Classification)
Probability Calibration (Platt Scaling)
Quantitative Risk Management (VaR, ES)
Monte Carlo Simulations