Calibration & Track Record
Reliability, Brier decomposition, track record, and AI-vs-benchmark performance
Window
Category
Platform
Why the VNX ensemble — selective by design
Overall Brier
Lower is better
Reliability
↓ lower is better
Resolution
↑ higher is better
Mean CLV
Signal-time proxy
Predictions
Resolved
Brier vs Market — Pooled (Methodology / Transparency)
Transparency, not the headline. This comparison pools every prediction, including the hedged band where the ensemble deliberately abstains, so it understates directional edge. The confident-band read is the value story (see top of page).
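A minimal sketch of why the pooled comparison and the confident-band read can differ. It scores the model and the market on every prediction, then again on only those predictions outside a hedged band. The function names and the 0.40 to 0.60 hedge band are assumptions for illustration, not the dashboard's actual thresholds or code.

```python
def brier(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_vs_market(model, market, outcomes, hedge_band=(0.4, 0.6)):
    """Compare model and market Brier scores, pooled and confident-band only.

    hedge_band is a hypothetical (low, high) interval: model probabilities
    inside it are treated as hedged and excluded from the confident subset.
    """
    low, high = hedge_band
    subsets = {
        "pooled": list(range(len(model))),
        "confident": [i for i, p in enumerate(model) if p < low or p > high],
    }
    result = {}
    for name, idx in subsets.items():
        if not idx:
            result[name] = None  # nothing to score in this subset
            continue
        y = [outcomes[i] for i in idx]
        result[name] = {
            "n": len(idx),
            "model": brier([model[i] for i in idx], y),
            "market": brier([market[i] for i in idx], y),
        }
    return result


if __name__ == "__main__":
    # Made-up toy inputs, for illustration only.
    model = [0.9, 0.5, 0.1, 0.55]
    market = [0.7, 0.5, 0.3, 0.5]
    outcomes = [1, 0, 0, 1]
    print(brier_vs_market(model, market, outcomes))
```

In the toy inputs, the two hedged predictions score about the same as the market, which narrows the pooled gap in relative terms compared with the confident subset.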
By Confidence — where the AI commits vs hedges
By Horizon — calibration across resolution timescales
Platform Calibration — AIA Forecaster Reliability
Brier Decomposition (Murphy)
Brier decomposition requires predictions that span multiple probability bins across resolved markets. Data is still accumulating from resolved AIA Forecaster predictions.
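A minimal sketch of the Murphy decomposition, Brier = reliability - resolution + uncertainty, to show why forecasts must span multiple probability bins. The function name, the equal-width binning, and the default of 10 bins are assumptions, not the dashboard's implementation. The identity is exact only when all forecasts within a bin share the same value; otherwise a small within-bin term remains.

```python
def murphy_decomposition(probs, outcomes, n_bins=10):
    """Binned Murphy decomposition of the Brier score.

    probs: forecast probabilities in [0, 1]; outcomes: 0/1 resolutions.
    n_bins (assumed default 10) sets the number of equal-width bins.
    """
    n = len(probs)
    base_rate = sum(outcomes) / n

    # Group (forecast, outcome) pairs by equal-width probability bin.
    bins = {}
    for p, o in zip(probs, outcomes):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins.setdefault(k, []).append((p, o))

    reliability = 0.0
    resolution = 0.0
    for members in bins.values():
        n_k = len(members)
        mean_p = sum(p for p, _ in members) / n_k  # average forecast in bin
        freq = sum(o for _, o in members) / n_k    # observed frequency in bin
        reliability += n_k * (mean_p - freq) ** 2
        resolution += n_k * (freq - base_rate) ** 2

    return {
        "reliability": reliability / n,            # lower is better
        "resolution": resolution / n,              # higher is better
        "uncertainty": base_rate * (1 - base_rate),
        "bins_used": len(bins),
    }


if __name__ == "__main__":
    # Made-up toy inputs, for illustration only.
    print(murphy_decomposition([0.2, 0.2, 0.8, 0.8], [0, 1, 1, 1]))
```

When every forecast lands in a single bin, the observed frequency in that bin equals the base rate, so resolution is zero and the decomposition carries no information about discrimination.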
Wealth-Replay — third-Kelly bankroll over the calibration window
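A minimal sketch of a fractional-Kelly wealth replay, assuming each bet is a binary contract that pays 1 on a win and is bought at the market price. The function names, the (model probability, market price, outcome) tuple layout, and the sequential compounding are assumptions for illustration. The page does not specify how its replay is computed.

```python
def kelly_fraction(model_p, price):
    """Full-Kelly bankroll fraction for a binary contract paying 1.

    Positive means buy YES at `price`; negative means buy NO at 1 - price.
    Assumes 0 < price < 1.
    """
    if model_p > price:
        return (model_p - price) / (1.0 - price)
    if model_p < price:
        return -(price - model_p) / price
    return 0.0


def replay_wealth(bets, bankroll=1.0, kelly_scale=1.0 / 3.0):
    """Replay bets in order, staking kelly_scale times the full-Kelly fraction.

    bets: iterable of (model_p, price, outcome) with outcome in {0, 1}.
    Returns the bankroll path, starting with the initial bankroll.
    """
    path = [bankroll]
    for model_p, price, outcome in bets:
        f = kelly_scale * kelly_fraction(model_p, price)
        stake = abs(f) * bankroll
        if f > 0:    # long YES: net odds are (1 - price) / price
            bankroll += stake * (1.0 - price) / price if outcome == 1 else -stake
        elif f < 0:  # long NO: net odds are price / (1 - price)
            bankroll += stake * price / (1.0 - price) if outcome == 0 else -stake
        path.append(bankroll)
    return path


if __name__ == "__main__":
    # Made-up toy bets, for illustration only.
    print(replay_wealth([(0.7, 0.5, 1), (0.2, 0.5, 1)]))
```

Scaling the stake to one third of full Kelly trades some expected growth for a smoother bankroll path, which matters when the model probabilities themselves are uncertain.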
Le 4-Component Decomp — where the residual variance lives
Skill vs Luck — bootstrap decomposition of track-record edge