Risk Engine Validation & Stress Testing Report

API Latency Benchmark Evidence

Production-verified per-request latency for the risk engine API under sustained load. Supports Test 11.1. For term definitions, see the Glossary. For data sources, see Data Provenance.

Benchmark Configuration

The benchmark sent 1,000 requests to the /loan_risk endpoint of the production risk engine deployment. Each request used the same Monte Carlo parameters as the production scoring pipeline (see Data Provenance).

| Parameter | Value |
| --- | --- |
| Requests | 1,000 |
| Model | GBM (stressed) |
| MC iterations | 10,000 |
| Model lookback | 90 days |
| Loan duration | 365 days |
| Volatility shock | 9.0 (10x stress) |
| mc_top_up | 50,000 |
| Quantiles | 0.99, 0.999, 0.9999 |
| Collateral | BTC (100%) |
| Loan | USDC (100%) |
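
This configuration could be exercised by a harness along these lines; a minimal sketch, where the endpoint path, parameter values, and request count come from the report, but the payload field names and the `score` callable are illustrative assumptions:

```python
import time

# Illustrative request payload. The parameter values come from the report;
# the field names are assumptions, not the production API schema.
PAYLOAD = {
    "model": "gbm_stressed",
    "mc_iterations": 10_000,
    "lookback_days": 90,
    "loan_duration_days": 365,
    "volatility_shock": 9.0,
    "mc_top_up": 50_000,
    "quantiles": [0.99, 0.999, 0.9999],
    "collateral": {"BTC": 1.0},
    "loan": {"USDC": 1.0},
}

def run_benchmark(score, n_requests=1_000):
    """Call `score(payload)` n_requests times; return sorted latencies in seconds.

    `score` stands in for the real call, e.g. an HTTP POST to /loan_risk.
    """
    latencies = []
    for _ in range(n_requests):
        t0 = time.perf_counter()
        score(PAYLOAD)
        latencies.append(time.perf_counter() - t0)
    return sorted(latencies)

def percentile(sorted_vals, p):
    """Nearest-rank percentile on pre-sorted samples (p in [0, 1])."""
    idx = min(len(sorted_vals) - 1, int(p * len(sorted_vals)))
    return sorted_vals[idx]
```

With the real endpoint wired in as `score`, p50/p90/p99 fall out of `percentile(latencies, 0.50)` and so on.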

Latency Results (Test 11.1)

| Metric | Latency | Acceptance |
| --- | --- | --- |
| p50 | 4.581 s | < 12 s |
| p90 | 6.784 s | < 12 s |
| p99 | 7.444 s | < 12 s (PASS) |

1,000 of 1,000 requests completed successfully (0 failures). All requests used the production Monte Carlo configuration (10,000 iterations), matching the convergence parameters validated in Test 2.1.
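
The pass/fail decision reduces to checking each measured percentile against the 12 s acceptance bound; a minimal check using the figures from the results table:

```python
# Acceptance check for Test 11.1: every reported percentile must be under 12 s.
ACCEPTANCE_S = 12.0

# Measured latencies (seconds) from the results table.
measured = {"p50": 4.581, "p90": 6.784, "p99": 7.444}

def check_acceptance(percentiles, bound=ACCEPTANCE_S):
    """Return (passed, failures); failures maps percentile name -> latency over bound."""
    failures = {k: v for k, v in percentiles.items() if v >= bound}
    return (not failures, failures)

passed, failures = check_acceptance(measured)
# passed → True, failures → {}
```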

Scalability

The benchmark measures per-request compute latency under sustained load. The risk engine is deployed on horizontally scalable infrastructure, so concurrent throughput scales linearly with the number of instances; scaling out is an infrastructure configuration choice, not a code limitation.

Compute latency scales linearly with MC iterations: at 10,000 iterations (the production default), p99 is 7.4 s. Higher iteration counts for tighter convergence (100k and 1M, as tested in Test 2.1) require proportionally more time per request and are handled as batch operations, not real-time queries.
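
Given the linear scaling stated above, expected p99 at higher iteration counts can be extrapolated from the measured 10,000-iteration baseline; a back-of-envelope estimate, not a measured figure:

```python
# Linear extrapolation of p99 latency from the measured 10k-iteration baseline.
BASELINE_ITERS = 10_000
BASELINE_P99_S = 7.444  # measured p99 at the production default

def estimated_p99(iterations):
    """Estimate p99 latency (seconds), assuming latency grows linearly with MC iterations."""
    return BASELINE_P99_S * iterations / BASELINE_ITERS

# estimated_p99(100_000) ≈ 74 s and estimated_p99(1_000_000) ≈ 744 s (~12 min),
# which is why higher-convergence runs are handled as batch jobs rather than
# real-time queries.
```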