
Model Comparison Evidence

Risk engine benchmark results comparing GBM, GARCH, and Historical simulation models. Supports Tests 2.1, 7.1, 7.2, 7.3a, and 7.3b. For term definitions, see the Glossary. For inputs and timestamps, see Data Provenance.

All results come from the production risk engine, March 2026 release, run on 2026-03-26 and 2026-03-27.

Three-Model Architecture

The risk engine supports three simulation approaches via the same /loan_risk endpoint. All three use the same loan state machine (nominal, partial liquidation, closeout) — they differ only in how price paths are generated.

| Model | Price Path Generation | Best For |
| --- | --- | --- |
| GBM | Geometric Brownian Motion with Cholesky-correlated paths and optional volatility shock multiplier | Stress testing: tail events scaled by volatility multiplier |
| GARCH | Beta-GARCH with market-factor residual modeling, volatility clustering, BIC-selected lag structure | Volatility clustering: captures periods where high volatility persists and produces fatter tails than constant-volatility models |
| Historical | Non-parametric sliding window replay of actual price history | Captures historical price movements and crashes without distributional assumptions |
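As a rough illustration of the GBM row above, the sketch below generates Cholesky-correlated price paths and applies a volatility shock as a (1 + shock) multiplier on volatility. The function name, its arguments, the daily step of 1/365, and the drift correction are assumptions made for this example; the engine's actual implementation is not shown in this report.

```python
import numpy as np

def simulate_gbm_paths(s0, mu, cov, n_days, n_paths, volatility_shock=0.0, seed=0):
    """Simulate correlated daily GBM price paths (illustrative sketch only).

    s0, mu: per-asset initial prices and annualized drifts.
    cov: annualized covariance matrix of log returns.
    volatility_shock: volatility is scaled by (1 + volatility_shock).
    Returns an array of shape (n_paths, n_days, n_assets).
    """
    s0 = np.asarray(s0, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # Scaling volatility by (1 + shock) scales the covariance by (1 + shock)^2.
    stressed_cov = np.asarray(cov, dtype=float) * (1.0 + volatility_shock) ** 2
    chol = np.linalg.cholesky(stressed_cov)  # lower-triangular factor
    dt = 1.0 / 365.0  # assumed daily step on a 365-day calendar
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_paths, n_days, s0.size))
    # Right-multiplying by chol.T gives each row of shocks the stressed covariance.
    shocks = np.sqrt(dt) * (z @ chol.T)
    drift = (mu - 0.5 * np.diag(stressed_cov)) * dt
    return s0 * np.exp(np.cumsum(drift + shocks, axis=1))

# Hypothetical inputs: two assets with 50% annual volatility and 0.9 correlation.
cov = [[0.25, 0.225], [0.225, 0.25]]
base = simulate_gbm_paths([100.0, 100.0], [0.0, 0.0], cov, n_days=30, n_paths=2000)
stressed = simulate_gbm_paths([100.0, 100.0], [0.0, 0.0], cov, n_days=30,
                              n_paths=2000, volatility_shock=9.0)
```

With the same seed, the stressed run reuses the same random draws, so the dispersion of log prices is exactly ten times that of the base run.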

Scoring from Simulation Output

The Extreme Event Resilience score decreases linearly with Expected Shortfall (ES) measured from the lender's perspective: higher ES means greater potential capital impairment for the lender, not just borrower liquidation. A score of 1.00 means negligible lender loss. A score of 0.50 means the lender's tail-risk ES is 50% of the loan principal. This is distinct from Pr(Liquidation): a pool can have high liquidation probability but near-zero lender loss if collateral ratios provide sufficient buffer.
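The two reference points above (1.00 for negligible loss, 0.50 for ES equal to 50% of principal) are consistent with a simple linear mapping, which also matches the scores in the Multi-Pool table when ES @99.99% is used. The sketch below assumes that form. The function name and the clamp at zero are assumptions, not the engine's documented behavior.

```python
def extreme_event_resilience_score(es_fraction):
    """Map lender Expected Shortfall (as a fraction of principal) to a score in [0, 1].

    Assumes score = 1 - ES, clamped at 0. The clamp is an assumption.
    """
    if es_fraction < 0:
        raise ValueError("Expected Shortfall must be non-negative")
    return max(0.0, 1.0 - es_fraction)

# ES @99.99% for ETH/USDC and wstETH/USDC from the Multi-Pool table.
print(round(extreme_event_resilience_score(0.6287), 3))  # 0.371
print(round(extreme_event_resilience_score(0.6962), 3))  # 0.304
```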

Model Comparison: wstETH/USDC (Tests 7.1, 7.2, 7.3a)

Configuration

See Data Provenance — Model Comparison Input for full JSON payloads, Parameter Mapping for L/N to LTV conversion, and execution timestamps.

| Parameter | GBM / GARCH | Historical |
| --- | --- | --- |
| Collateral | wstETH | wstETH |
| Loan | USDC | USDC |
| LTV | 66.7% | 66.7% |
| Liquidation LTV | 87.0% | 87.0% |
| Loan duration | 365 days | 365 days |
| MC iterations | 10,000 | N/A (sliding window) |
| Volatility stress | volatility_shock = 9.0 → scales vol by (1 + 9.0) = 10x | N/A |
| Lookback | 365 days | 730 days (~366 daily windows) |
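The authoritative L/N to LTV conversion is in Data Provenance (Parameter Mapping). As a hedged illustration, the N and L values quoted alongside the stress-sensitivity tables in this report (N=1.5, L=1.15 for 66.7% / 87.0%) are consistent with a reciprocal mapping, which the sketch below assumes. The function name is hypothetical.

```python
def ltv_from_ratio(ratio):
    """Convert a collateralization ratio (N or L) to an LTV fraction.

    Assumes LTV = 1 / ratio, inferred from the values quoted in this report.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / ratio

for label, n_ratio, l_ratio in [("production", 1.5, 1.15), ("stressed", 1.25, 1.1)]:
    print(f"{label}: LTV={ltv_from_ratio(n_ratio):.1%}, "
          f"liquidation LTV={ltv_from_ratio(l_ratio):.1%}")
```

Under this assumption the production pair reproduces the 66.7% LTV and 87.0% Liquidation LTV used in this configuration.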

Simulation Outcomes

| Metric | GBM (stressed) | GBM (no shock) | GARCH | Historical |
| --- | --- | --- | --- | --- |
| Pr(Liquidation) | >99% | 68.8% | 93.9% | 98.9% |
| Avg Lender Loss | <1% | <1% | <1% | <1% |

Expected Shortfall by Quantile

| Quantile | GBM (stressed) | GBM (no shock) | GARCH | Historical |
| --- | --- | --- | --- | --- |
| LossES @99% | 28.65% | <1% | <1% | <1% |
| LossES @99.9% | 53.21% | <1% | <1% | <1% |
| LossES @99.99% | 64.81% | <1% | <1% | <1% |
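For readers unfamiliar with the metric, Expected Shortfall at a quantile is conventionally the average loss in the tail at or beyond that quantile. The sketch below shows one common empirical estimator. The engine's exact estimator is not described here, so the function name and the tail convention (losses at or above the quantile) are assumptions.

```python
import numpy as np

def expected_shortfall(losses, quantile):
    """Average of the simulated losses at or above the given quantile."""
    losses = np.asarray(losses, dtype=float)
    threshold = np.quantile(losses, quantile)
    tail = losses[losses >= threshold]  # never empty: the maximum always qualifies
    return float(tail.mean())

# Toy example: losses of 1..100 (percent of principal), one per simulated path.
toy_losses = np.arange(1, 101)
print(expected_shortfall(toy_losses, 0.90))  # mean of 91..100
```

Note that with 10,000 paths the 99.99% tail contains roughly one observation, so estimates at that quantile depend heavily on the single worst path.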

Interpretation

GBM with volatility stress is the most conservative model by design. The stress multiplier scales the historical volatility estimates by 10x (volatility_shock = 9.0), producing simulated return distributions with far wider dispersion than historically observed. Under this stress, even well-collateralized positions face closeout, and lender losses are large: ES reaches 64.81% at the 99.99th percentile.

GBM without volatility stress produces near-zero lender loss — same as GARCH and Historical. This confirms that the volatility stress is the sole driver of extreme tail loss. Without it, GBM behaves comparably to the other models for this collateral configuration.

GARCH captures volatility clustering dynamics. The Beta-GARCH model produces fatter tails than constant-volatility GBM. Under these dynamics, 93.9% of simulations trigger liquidation events, but collateral ratios remain adequate and lender losses stay <1%.
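To illustrate what volatility clustering does to the return distribution, the sketch below simulates a plain GARCH(1,1) process. This is a deliberately simpler stand-in, not the Beta-GARCH model with market-factor residuals used by the engine, and the parameter values are hypothetical.

```python
import numpy as np

def simulate_garch11_returns(n_steps, omega=1e-5, alpha=0.10, beta=0.85, seed=0):
    """Simulate returns from a plain GARCH(1,1) with Gaussian innovations."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_steps)
    returns = np.empty(n_steps)
    variance = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n_steps):
        returns[t] = np.sqrt(variance) * z[t]
        # A large move today raises tomorrow's variance, which produces clustering.
        variance = omega + alpha * returns[t] ** 2 + beta * variance
    return returns

def excess_kurtosis(x):
    """Sample excess kurtosis; a value of 0 corresponds to a normal distribution."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float((centered ** 4).mean() / (centered ** 2).mean() ** 2 - 3.0)

garch_returns = simulate_garch11_returns(50_000)
print(excess_kurtosis(garch_returns))
```

A positive excess kurtosis indicates fatter tails than a constant-volatility Gaussian model with the same variance.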

Historical replays actual market conditions. The sliding window replay uses a 730-day lookback (2 years of daily price data), producing ~366 overlapping windows of 365-day loan simulations. 98.9% of windows triggered liquidation events. Worst-case CCR across all windows was 1.014 — lender losses remained <1%.
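The window count follows from the lookback and loan duration: 730 daily prices and a 365-day loan give 730 - 365 + 1 = 366 overlapping windows. The sketch below replays such windows with a simplified liquidation check. It ignores interest accrual, partial liquidation, and closeout, and the function name and the fixed-debt assumption are illustrative only.

```python
import numpy as np

def sliding_window_liquidation_rate(prices, loan_days, ltv, liquidation_ltv):
    """Return (fraction of windows that hit the liquidation LTV, number of windows)."""
    prices = np.asarray(prices, dtype=float)
    n_windows = len(prices) - loan_days + 1
    if n_windows < 1:
        raise ValueError("price history is shorter than the loan duration")
    liquidated = 0
    for start in range(n_windows):
        window = prices[start:start + loan_days]
        # With fixed debt, LTV moves inversely with the collateral price.
        ltv_path = ltv * window[0] / window
        if ltv_path.max() >= liquidation_ltv:
            liquidated += 1
    return liquidated / n_windows, n_windows

# Synthetic price series for illustration only (not market data).
flat_rate, n_windows = sliding_window_liquidation_rate(np.full(730, 100.0), 365, 0.667, 0.87)
falling_rate, _ = sliding_window_liquidation_rate(np.linspace(100.0, 50.0, 730), 365, 0.667, 0.87)
print(n_windows, flat_rate, falling_rate)
```

A flat price series never triggers liquidation, while a steadily falling one triggers it in every window, which is the kind of data-driven sensitivity the Historical model is meant to expose.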

Stress Event Sensitivity: October 10th wstETH

To demonstrate the Historical model's sensitivity to specific stress events, we ran two simulations with identical parameters (wstETH/USDC, LTV=66.7%, Liquidation LTV=87.0%, 730-day lookback) but different analysis dates. For input payloads, see Data Provenance — Historical Stress Event.

| Simulation | Analysis Date | Lookback Window | Pr(Liquidation) | Worst CCR | Lender ES |
| --- | --- | --- | --- | --- | --- |
| Includes Oct 10 | 2026-03-27 | 2024-03-28 → 2026-03-27 | 98.9% | 1.014 | <1% |
| Excludes Oct 10 | 2025-10-09 | 2023-10-10 → 2025-10-09 | 63.7% | 1.014 | <1% |

Including the period around October 10th increases liquidation probability from 63.7% to 98.9%. The jump reflects not a single event but a broader period of poor market conditions — empirically verified through sustained price deterioration and elevated volatility across that window. The sliding windows that overlap with this stress sequence nearly all trigger liquidation events. The collateral ratios hold in both cases (worst CCR stays above 1.0), but the model clearly captures how prolonged market deterioration compounds liquidation risk. This confirms the Historical model responds to real market dynamics without distributional assumptions — its output changes when the data changes.

GBM without stress vs GARCH: GBM without volatility stress shows 68.8% liquidation probability, while GARCH shows 93.9%. GARCH captures volatility clustering that the constant-volatility GBM misses, producing more liquidation events under realistic dynamics. Both produce <1% lender loss at this LTV, so the difference surfaces in liquidation frequency rather than tail loss. The stressed GBM serves a different purpose — it models tail scenarios that lookback-calibrated models would miss because they haven't observed them yet (e.g., multi-sigma events like the 2022 LUNA/UST crash or March 2020 COVID drawdown).

Stress Period Sensitivity: GARCH vs GBM (Test 10.1)

Eight simulation runs on wstETH/USDC compare GBM (no shock) and GARCH, with and without the October 10, 2025 crash in the lookback window. Two LTV configurations are tested: production (66.7%) and stressed (80%). For input payloads, see Data Provenance — Stress Sensitivity.

LTV 66.7% (Production: N=1.5, L=1.15)

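| Model | Period | Pr(Liquidation) | Lender ES @99.99% | Worst CCR |
| --- | --- | --- | --- | --- |
| GBM | Includes Oct 10 | 54.4% | <1% | 0.999 |
| GARCH | Includes Oct 10 | 93.1% | <1% | 0.988 |
| GBM | Excludes Oct 10 | 45.0% | <1% | 1.014 |
| GARCH | Excludes Oct 10 | 93.7% | <1% | 0.964 |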

LTV 80% (Stressed: N=1.25, L=1.1)

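| Model | Period | Pr(Liquidation) | Lender ES @99.99% | Worst CCR |
| --- | --- | --- | --- | --- |
| GBM | Includes Oct 10 | 74.0% | <1% | 0.961 |
| GARCH | Includes Oct 10 | 96.5% | <1% | 0.942 |
| GBM | Excludes Oct 10 | 66.2% | <1% | 0.945 |
| GARCH | Excludes Oct 10 | 96.6% | <1% | 0.930 |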

At production LTV with the October crash in the lookback, GARCH produces about 71% more liquidation events than GBM in relative terms (93.1% vs 54.4%, a gap of 38.7 percentage points). Including the October 10 stress event raises GBM liquidation probability by 9.4 percentage points at 66.7% LTV (45.0% to 54.4%) and by 7.8 percentage points at 80% LTV (66.2% to 74.0%). GARCH is essentially unchanged (93.7% vs 93.1% at 66.7% LTV, 96.6% vs 96.5% at 80% LTV) because it already captures volatility clustering from broader market dynamics. Lender ES remains <1% at both LTV configurations because the liquidation mechanism recovers capital: the collateral buffer absorbs the stress. The ES differential materializes only at LTV configurations where the collateral buffer is insufficient, which represents a protocol design failure rather than normal operations.

Multi-Pool GBM Results

Four pools scored with production GBM configuration (stressed, 10k MC, 365-day). For input payloads and per-pool parameters, see Data Provenance — Multi-Pool Input.

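| Pool | LTV | Liq LTV | ES @99% | ES @99.9% | ES @99.99% | Score |
| --- | --- | --- | --- | --- | --- | --- |
| ETH/USDC | 58.8% | 76.9% | <1% | 43.83% | 62.87% | 0.371 |
| BTC/USDC | 58.8% | 76.9% | <1% | <1% | <1% | 1.000 |
| cbBTC/USDC | 58.8% | 76.9% | <1% | <1% | <1% | 1.000 |
| wstETH/USDC | 66.7% | 87.0% | 34.22% | 60.33% | 69.62% | 0.304 |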

Observations:

  • BTC and cbBTC both show <1% lender loss at all quantiles. The 58.8% LTV and 76.9% Liquidation LTV provide sufficient buffer: borrowers are liquidated in most paths (Pr(Liquidation) of 96.4% and 96.0%, respectively), but lenders are fully protected.
  • ETH with the same conservative LTV (58.8%) shows <1% ES at the 99th percentile but large tail loss at the extreme quantiles (43.83% at 99.9%, 62.87% at 99.99%).
  • wstETH with higher LTV (66.7%) and higher Liquidation LTV (87.0%) shows the highest tail loss (69.62% at 99.99%): less buffer means the stress can overwhelm the liquidation mechanism.

Native vs Wrapped Comparison

Same standardized parameters (LTV=58.8%, Liquidation LTV=76.9%, stressed) applied to all four assets for a clean comparison where only the asset differs. Uses the same input payloads as Multi-Pool with uniform L/N values.

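| Asset | ES @99% | ES @99.9% | ES @99.99% | Pr(Liquidation) |
| --- | --- | --- | --- | --- |
| ETH | <1% | 39.46% | 59.83% | >99% |
| wstETH | <1% | 43.28% | 59.33% | >99% |
| BTC | <1% | <1% | <1% | 96.4% |
| cbBTC | <1% | <1% | <1% | 96.1% |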

Key finding: At identical LTV parameters, ETH and wstETH produce nearly identical tail loss (59.83% vs 59.33% at 99.99th). BTC and cbBTC both produce near-zero lender loss. The wrapped versions behave consistently with their native counterparts, confirming the risk engine correctly prices wrapped assets from their own historical price series.

Convergence Test (Test 2.1)

Ten parallel GBM runs on ETH/USDC with identical parameters (stressed, LTV=58.8%, Liquidation LTV=76.9%) were executed at each iteration count. CV (coefficient of variation) is the standard deviation of ES across the runs divided by the mean. For input payload and execution timestamps (10k, 100k, 1M), see Data Provenance — Convergence Input.

Convergence by Quantile — 10k MC iterations

| Quantile | Mean ES (%) | Std Dev (%) | CV |
| --- | --- | --- | --- |
| 99th | 0.15 | 0.02 | 11.9% |
| 99.9th | 45.74 | 2.77 | 6.1% |
| 99.99th | 58.19 | 5.50 | 9.5% |

Convergence by Quantile — 100k MC iterations

Same pool and parameters, 10x more simulation paths:

| Quantile | Mean ES (%) | Std Dev (%) | CV |
| --- | --- | --- | --- |
| 99th | 0.15 | 0.01 | 5.4% |
| 99.9th | 45.93 | 1.49 | 3.2% |
| 99.99th | 61.02 | 2.34 | 3.8% |

Convergence by Quantile — 1M MC iterations

Same pool and parameters, 100x more simulation paths than 10k.

| Quantile | Mean ES (%) | Std Dev (%) | CV |
| --- | --- | --- | --- |
| 99th | 0.15 | 0.001 | 0.78% |
| 99.9th | 45.73 | 0.28 | 0.62% |
| 99.99th | 61.05 | 0.64 | 1.05% |

Convergence Comparison Across Iteration Counts

| Quantile | CV @ 10k | CV @ 100k | CV @ 1M |
| --- | --- | --- | --- |
| 99th | 11.9% | 5.4% | 0.78% |
| 99.9th | 6.1% | 3.2% | 0.62% |
| 99.99th | 9.5% | 3.8% | 1.05% |

At 1M iterations, all three quantiles converge under 2% CV. The 99.99th percentile drops from 9.5% (10k) to 1.05% (1M).

Interpretation

The production system currently runs at 10k MC iterations, which achieves 6.1% CV at the 99.9th percentile. At the 99.99th percentile (the scoring design point), CV is 9.5% — above the 2% threshold.

If CV < 2% is a mandatory requirement, the system supports it: at 1M iterations, the 99.99th percentile converges to 1.05% CV. This is a compute cost tradeoff, not a model limitation — the system supports configurable iteration counts and quantiles.
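To size the compute tradeoff, a common rule of thumb is that Monte Carlo sampling error shrinks roughly as 1/sqrt(N). The sketch below applies that rule to the reported 10k figures. It is an approximation under that assumption, not a property verified for this engine, and the function names are hypothetical.

```python
import math

def projected_cv(cv_ref, n_ref, n_target):
    """Project CV at n_target iterations, assuming error scales as 1/sqrt(N)."""
    return cv_ref * math.sqrt(n_ref / n_target)

def iterations_for_target_cv(cv_ref, n_ref, cv_target):
    """Iteration count expected to reach cv_target under the same assumption."""
    return math.ceil(n_ref * (cv_ref / cv_target) ** 2)

# Reference point: 9.5% CV at the 99.99th percentile with 10k iterations.
print(projected_cv(9.5, 10_000, 1_000_000))
print(iterations_for_target_cv(9.5, 10_000, 2.0))
```

The rule projects about 0.95% CV at 1M iterations, close to the measured 1.05%, and suggests that roughly 226k iterations would reach 2% at the 99.99th percentile. The measured 100k figure (3.8%) is above the projected 3.0%, so the estimate should be treated as a lower bound to confirm empirically.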

Correlation Test (Test 7.3b)

Two GBM simulations with identical parameters (stressed, LTV=66.7%, Liquidation LTV=87.0%) except collateral composition. For input payloads (correlated and uncorrelated pairs), see Data Provenance — Correlation Test.

| Configuration | Collateral | ES @99% | ES @99.9% | ES @99.99% |
| --- | --- | --- | --- | --- |
| Correlated pair | wstETH (50%) + cbETH (50%) | 31.88% | 54.42% | 71.18% |
| Uncorrelated pair | BTC (50%) + DAI (50%) | <1% | <1% | <1% |

ES for the correlated pair is far higher at every quantile (71.18% vs <1% at the 99.99th percentile). The GBM model generates price paths using Cholesky decomposition of the historical covariance matrix. wstETH and cbETH (both ETH liquid staking derivatives) are highly correlated: when one drops, the other drops with it. The diversification benefit is minimal, and the correlated decline can overwhelm the liquidation mechanism.

BTC and DAI (a stablecoin) are effectively uncorrelated. DAI maintains its peg while BTC moves, providing genuine diversification. Even under stress, the mixed collateral produces <1% lender loss.
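The diversification effect can be seen directly in the volatility of a 50/50 basket. The sketch below computes portfolio volatility from weights, per-asset volatilities, and a correlation matrix. All numeric inputs are hypothetical and are not estimates for wstETH, cbETH, BTC, or DAI.

```python
import numpy as np

def portfolio_volatility(weights, vols, corr):
    """Volatility of a weighted basket: sqrt(w' * Sigma * w)."""
    weights = np.asarray(weights, dtype=float)
    vols = np.asarray(vols, dtype=float)
    cov = np.outer(vols, vols) * np.asarray(corr, dtype=float)
    return float(np.sqrt(weights @ cov @ weights))

# Hypothetical inputs: two volatile assets with 0.95 correlation, versus one
# volatile asset paired with a near-constant, uncorrelated one.
correlated = portfolio_volatility([0.5, 0.5], [0.8, 0.8], [[1.0, 0.95], [0.95, 1.0]])
mixed = portfolio_volatility([0.5, 0.5], [0.8, 0.01], [[1.0, 0.0], [0.0, 1.0]])
print(correlated, mixed)
```

In this toy setup the highly correlated basket keeps almost all of the single-asset volatility, while the mixed basket has roughly half of it.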