← Risk Engine Validation & Stress Testing Report

Data Integrity Evidence

Multi-layer defense against stale, missing, or imprecise data in production. Supports Tests 9.1 and 9.2. For term definitions, see the Glossary. For data sources, see Data Provenance.

Three-Layer Defense

The platform prevents stale data from producing unqualified risk scores through three layers that operate at different points in the data pipeline (Data Provenance).

| Layer | Mechanism | Detection Point | Outcome |
| --- | --- | --- | --- |
| 1 — Cross-source validation | Invariant checks cross-verify data across multiple sources | Per-snapshot, during ETL | Divergence persisted to data warehouse; pipeline halts if threshold exceeded |
| 2 — Pipeline freshness monitoring | Tracks table age against current timestamp | Scheduled monitoring | Table flagged STALE with exact age in minutes |
| 3 — Oracle architectural scoring | Deductions for contracts lacking staleness validation | Oracle assessment | Score capped below Reference-grade |

Layer 1 — Cross-Source Validation

Every pool snapshot runs invariant checks that cross-verify data from multiple sources. Results are persisted as structured JSON in each snapshot record — not transient logs, but first-class auditable fields.

The system detects three categories of data integrity failures:

  • Missing or unavailable prices — when a primary price source returns zero, null, or stale values, the system detects the failure, switches to an alternative source (e.g., CoinGecko), and records the fallback in the snapshot so it remains auditable.
  • Position-level divergence — position totals are cross-checked against aggregates reported by the protocol. Divergence beyond configured thresholds halts the pipeline.
  • Market-level divergence — supply, borrow, and total USD values are cross-verified across data sources. Discrepancies are recorded and, if material, prevent score generation.

When any divergence exceeds its configured threshold, the pipeline fails and does not produce a score. The system will not generate a risk assessment from data that violates its invariants.
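The halt-on-divergence behavior can be sketched as a per-snapshot invariant check. This is an illustrative reconstruction, not the platform's actual code: the function name, result fields, and the 0.1% threshold are assumptions.

```python
from decimal import Decimal

# Illustrative threshold; the real configured value is not stated in this report.
DIVERGENCE_THRESHOLD = Decimal("0.001")  # 0.1% relative divergence


class DivergenceError(Exception):
    """Raised when cross-source divergence exceeds the configured threshold."""


def check_invariant(computed: Decimal, reported: Decimal) -> dict:
    """Cross-verify a computed aggregate against a protocol-reported value.

    Returns an auditable result record (persisted as structured JSON on the
    snapshot); raises DivergenceError to halt the pipeline on violation.
    """
    divergence = abs(computed - reported) / reported if reported else Decimal(1)
    result = {
        "computed": str(computed),
        "reported": str(reported),
        "divergence_ratio": str(divergence),
        "passed": divergence <= DIVERGENCE_THRESHOLD,
    }
    if not result["passed"]:
        raise DivergenceError(f"divergence {divergence} exceeds threshold")
    return result
```

Raising rather than returning a flag is what makes "no score from invariant-violating data" structural: a halted pipeline cannot reach the scoring stage.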

Layer 2 — Pipeline Freshness Monitoring

A monitoring service compares the latest data timestamp against the current time for every materialized table.

| Status | Condition | Effect |
| --- | --- | --- |
| SYNCHRONIZED | Data age within threshold | Normal operation |
| STALE | Data age exceeds threshold | Flagged with exact age in minutes |
| MISSING_DATA | Missing rows exceed percentage threshold | Flagged for investigation |

Layer 3 — Oracle Architectural Scoring

Oracle scores include deductions for contracts that lack staleness validation in their on-chain implementation.

| Deduction | Penalty | What It Means |
| --- | --- | --- |
| No staleness check | -0.05 | Oracle adapter does not verify that the price data is recent |
| Staleness passthrough | -0.10 | Stale prices from the underlying feed pass through unchecked |

An oracle that cannot detect its own staleness is structurally penalized, contributing to a score below Reference-grade (0.9-1.0).
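Applying the deductions is simple subtraction; the interesting property is that the penalties are large enough to push an otherwise strong oracle below the Reference-grade floor. A sketch using the penalty values from the table above (the function and base score are hypothetical):

```python
from decimal import Decimal

# Penalty values from the deduction table; structure is illustrative.
PENALTIES = {
    "no_staleness_check": Decimal("0.05"),
    "staleness_passthrough": Decimal("0.10"),
}
REFERENCE_GRADE_FLOOR = Decimal("0.9")  # Reference-grade band is 0.9-1.0


def apply_staleness_deductions(base_score: Decimal, findings: list[str]) -> Decimal:
    """Subtract each staleness finding's penalty from the base score."""
    score = base_score - sum(PENALTIES[f] for f in findings)
    return max(score, Decimal("0"))
```

With both findings present, even a 0.95 base score lands at 0.80, below the Reference-grade band.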

Production Track Record

The pipeline has been running continuously in production since January 2026 without a single unrecoverable failure. Throughout this period, the cross-source validation layer has been active on every snapshot — detecting, recording, and where necessary halting execution when data quality invariants were violated. Stale or missing price events were caught and handled without manual intervention.

How the Layers Interact

Layer 3 identifies oracles structurally vulnerable to staleness. Layer 1 detects when stale or missing data actually enters the pipeline. Layer 2 detects when data stops arriving entirely. Together, they ensure that stale data cannot produce an unqualified risk score.

Precision & Schema Normalization (Test 9.2)

Raw on-chain values are transformed into a normalized schema using precision-preserving types at every stage of the pipeline (Data Provenance).

Type Mapping

| On-Chain Type | Domain Layer | BigQuery Type | Rationale |
| --- | --- | --- | --- |
| uint256 (shares, raw amounts) | Decimal | STRING | Lossless storage — no truncation at any token decimal count |
| uint128 (rates, ratios) | Decimal | BIGNUMERIC | 76-digit precision — comfortably covers uint128's 39-digit range |
| Computed USD values | Decimal | BIGNUMERIC | Preserves sub-cent precision for institutional metrics |
| Percentages (APY, utilization) | Decimal | BIGNUMERIC | No floating-point rounding in rate calculations |

The domain layer uses Python Decimal throughout — no floating-point arithmetic touches financial values between RPC ingestion and BigQuery persistence.
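The mapping above can be illustrated with a small normalization step. This is a hypothetical sketch, assuming raw RPC values arrive as Python integers; the function and field names mirror the schema but are not the platform's actual code:

```python
from decimal import Decimal, getcontext

# Headroom beyond uint256's maximum of 78 decimal digits.
getcontext().prec = 80


def normalize_row(total_shares: int, borrowed_shares: int) -> dict:
    """Normalize raw on-chain values per the type mapping.

    uint256 shares are kept as exact decimal strings (-> STRING column);
    the computed rate stays in Decimal (-> BIGNUMERIC column). No float
    touches either value.
    """
    utilization = (Decimal(borrowed_shares) / Decimal(total_shares)
                   if total_shares else Decimal("0"))
    return {
        "total_shares": str(total_shares),   # lossless at any magnitude
        "utilization_rate": utilization,     # exact Decimal arithmetic
    }
```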

Cross-Verification Against Protocol Sources

The same cross-source validation layer described in Test 9.1 serves as the precision audit mechanism. Every snapshot cross-verifies computed values against protocol-reported aggregates:

  • Position totals are summed from individual on-chain positions and compared against the protocol's own aggregate endpoints. Divergence is persisted with the exact ratio and absolute USD amount.
  • USD valuations are independently computed from raw token amounts and external price feeds, then cross-checked against protocol-reported USD totals.

When the computed value diverges from the protocol's reported value beyond the configured threshold, the pipeline halts. A transformation step therefore cannot silently lose precision: any material loss would surface as a divergence and stop the pipeline before a score is produced.
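The precision audit described above can be sketched as a position-sum check that records both the divergence ratio and the absolute USD amount, as the bullet list notes. All names and the threshold are illustrative assumptions:

```python
from decimal import Decimal


def audit_positions(positions_usd: list[Decimal], reported_total: Decimal,
                    threshold: Decimal = Decimal("0.001")) -> dict:
    """Sum individual position values in Decimal and cross-check the result
    against the protocol-reported aggregate, persisting ratio and USD delta."""
    computed = sum(positions_usd, Decimal("0"))
    divergence_usd = abs(computed - reported_total)
    ratio = divergence_usd / reported_total if reported_total else Decimal(1)
    return {
        "computed_usd": str(computed),
        "reported_usd": str(reported_total),
        "divergence_ratio": str(ratio),
        "divergence_usd": str(divergence_usd),
        "halt": ratio > threshold,           # True would stop the pipeline
    }
```

Because the sum is carried out in Decimal, a sub-cent discrepancy in any position would appear in `divergence_usd` rather than vanishing into float rounding.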

Schema Evidence

Production schema from lending_pool_state_mt confirms the type mapping is enforced at the storage layer:

| Column | Type | Purpose |
| --- | --- | --- |
| total_shares | STRING | uint256 share totals — lossless |
| shares (position tables) | STRING | uint256 individual positions — lossless |
| utilization_rate | BIGNUMERIC | Computed rate — 38 decimal places preserved |
| supply_apy | BIGNUMERIC | Protocol rate — 38 decimal places preserved |
| collateral_ratio | BIGNUMERIC | Computed ratio — 38 decimal places preserved |
| position_value_usd | BIGNUMERIC | USD valuation — sub-cent precision |