OCP Statistical Arbitrage on S&P 100
EPFL, Master in Financial Engineering, Year 2 (2025)
Team: Matthias Wyss, Lina Sadgal, Yassine Wahidy
Final Report: Report
GitHub Repository: GitHub
This project implements a statistical arbitrage framework based on the Optimal Causal Path (OCP) algorithm. Unlike traditional correlation-based approaches, OCP uses dynamic programming to detect non-linear, time-varying "Leader-Follower" relationships in high-frequency data.
We analyze the S&P 100 constituents over the 2015 to 2017 period, using Best Bid and Offer (BBO) data and elastic time alignment to exploit market micro-inefficiencies.
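To make the dynamic-programming idea concrete, here is a minimal sketch of a lowest-cost monotone path through a time-warping grid. It assumes a squared-difference local cost and the three classic step directions (as in dynamic time warping); the cost function, path constraints, and the names `min_cost_path`, `leader`, and `follower` are illustrative assumptions, not taken from the report.

```python
def min_cost_path(leader, follower):
    """Cheapest monotone path through the (leader time, follower time) grid.

    Assumption: the local cost is the squared difference between the two
    series, and the path may step right, up, or diagonally.
    """
    n, m = len(leader), len(follower)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]   # cheapest cumulative cost to reach (i, j)
    back = [[None] * m for _ in range(n)]  # predecessor cell on that cheapest path

    for i in range(n):
        for j in range(m):
            local = (leader[i] - follower[j]) ** 2
            if i == 0 and j == 0:
                cost[i][j] = local
                continue
            candidates = []
            if i > 0:
                candidates.append((cost[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((cost[i][j - 1], (i, j - 1)))
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))
            best, prev = min(candidates)
            cost[i][j] = local + best
            back[i][j] = prev

    # Walk the predecessor links back from the final cell to recover the path.
    path = [(n - 1, m - 1)]
    while back[path[-1][0]][path[-1][1]] is not None:
        i, j = path[-1]
        path.append(back[i][j])
    path.reverse()
    return cost[n - 1][m - 1], path


# Toy series: the follower repeats the leader's spike two steps later.
leader = [0, 0, 1, 0, 0, 0]
follower = [0, 0, 0, 0, 1, 0]
total_cost, path = min_cost_path(leader, follower)
lags = [j - i for i, j in path]  # positive lag: follower trails the leader
```

In this toy case the path aligns the leader's spike at index 2 with the follower's spike at index 4, so the local lag at that point is 2. A real implementation would also need normalized inputs and a bound on the admissible lag.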
Strategy Components
- Lead-Lag Detection (OCP): Identifying stock pairs where a "Leader's" price movement predicts the "Follower's" future returns by finding the path of lowest total cost in a time-warping grid.
- Signal Generation: Monitoring idiosyncratic shocks via Bollinger Bands (k=2.5) combined with a minimum economic threshold (4 bps).
- Market-Neutral Execution: Implementing a hedged strategy (Long Follower / Short SPY) to isolate alpha and neutralize systematic market risk.
- Dynamic Reaction Window: Tailoring the holding period to the causal stability ($\sigma_l$) derived from the OCP path.
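The signal-generation rule can be sketched as follows. This is a minimal illustration that assumes a rolling window of past returns (the window length of 20 is a placeholder, not a value from the report) and a return series expressed in basis points; only `k=2.5` and the 4 bps threshold come from the description above. The function name `shock_signals` is hypothetical.

```python
from statistics import mean, stdev


def shock_signals(returns_bps, window=20, k=2.5, min_move_bps=4.0):
    """Flag observations that break a Bollinger band and exceed a minimum size.

    Returns +1 / -1 for an upward / downward shock and 0 otherwise.
    Assumption: the band is built from the previous `window` observations only,
    so the current observation never contaminates its own band.
    """
    signals = []
    for t, r in enumerate(returns_bps):
        if t < window:
            signals.append(0)  # not enough history to build a band yet
            continue
        history = returns_bps[t - window:t]
        mu, sd = mean(history), stdev(history)
        outside_band = abs(r - mu) > k * sd      # statistical filter
        large_enough = abs(r) >= min_move_bps    # economic filter
        if outside_band and large_enough:
            signals.append(1 if r > mu else -1)
        else:
            signals.append(0)
    return signals


# Toy data: quiet alternating returns, then two shocks around a small move.
toy_returns = [1.0, -1.0] * 10 + [10.0, 0.5, -12.0]
toy_signals = shock_signals(toy_returns)
```

Requiring both conditions means a move must be unusual relative to recent volatility and large in absolute terms, which filters out band breaks that occur in very quiet periods.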
Performance & Risk Analysis
The results provide strong empirical evidence of predictive causal links, validating the core hypothesis that information transmission delays exist even in highly liquid markets.
| Metric | Gross Value |
|---|---|
| Total Return | 34.49% |
| Annualized Sharpe Ratio | 1.73 |
| Max Drawdown | -5.39% |
| Total Trades | 9,969 |
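For readers who want to reproduce metrics of this kind, the sketch below shows the conventional definitions of an annualized Sharpe ratio and a maximum drawdown. It assumes daily simple returns, a zero risk-free rate, and 252 trading days per year; the report may use different conventions, and the function names are illustrative.

```python
import math
from statistics import mean, stdev


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Mean over sample standard deviation, scaled by sqrt(periods per year).

    Assumption: zero risk-free rate.
    """
    return mean(daily_returns) / stdev(daily_returns) * math.sqrt(periods_per_year)


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded equity curve (a value <= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


# Toy return series, not data from the report.
toy_daily = [0.10, -0.05, -0.05, 0.20]
toy_sharpe = annualized_sharpe(toy_daily)
toy_mdd = max_drawdown(toy_daily)
```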
Execution Challenge: While the gross alpha is significant, sensitivity analysis reveals that net profitability depends heavily on transaction costs. Filtering for high-conviction signals (raising the threshold to 15 bps) is essential to overcome the bid-ask spread.
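The trade-off can be illustrated with a small sketch that applies a flat cost per trade and compares two signal thresholds. The trade list, the 2 bps cost, and the function name `net_pnl_bps` are made-up illustrations and do not come from the report's backtest.

```python
def net_pnl_bps(trades, cost_bps, threshold_bps):
    """Keep trades whose signal size meets the threshold, then subtract costs.

    `trades` is a list of (signal_size_bps, gross_pnl_bps) pairs.
    Assumption: a flat per-trade cost stands in for the bid-ask spread.
    Returns (number of trades kept, total net P&L in bps).
    """
    kept = [(s, g) for s, g in trades if abs(s) >= threshold_bps]
    return len(kept), sum(g - cost_bps for _, g in kept)


# Hypothetical trades: small signals earn little, large signals earn more.
toy_trades = [(5, 1.0), (6, -0.5), (16, 6.0), (20, 4.5), (4, 0.8)]
loose = net_pnl_bps(toy_trades, cost_bps=2.0, threshold_bps=4)
strict = net_pnl_bps(toy_trades, cost_bps=2.0, threshold_bps=15)
```

In this toy example the stricter threshold keeps fewer trades but a higher net total, because the small-signal trades do not earn enough to cover their cost.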
Tools & Libraries:
- Python & Polars: Used for efficient processing of the 55 GB raw BBO dataset.
- Jupyter Notebooks: For research and backtesting visualization.
- Pandas Market Calendars: To ensure precise alignment with NASDAQ/NYSE trading hours.
Techniques:
- Statistical Arbitrage (Pairs Trading)
- Dynamic Programming (OCP Algorithm)
- High-Frequency Backtesting
- Sensitivity Analysis & Risk Management