AI benchmark leaderboard source tracking
Timestamped LMArena-style leaderboard snapshots with model alias mapping, source-page archive, rank-change diff, and resolution-time monitor.
Will an OpenAI model rank #1 on LMArena at month end?
Buyer briefs this pack could answer.
Rule text, official statements, credible-media consensus, UMA vote status, and public Discord evidence references for the ceasefire-extension market.
Will the US-Iran ceasefire extension market resolve YES?
BLS release parser, CPI nowcast delta, Fed-cut odds reaction, and cross-venue repricing table for the next inflation print.
Will the next CPI release move June Fed-cut odds by 10 points?
Large-wallet accumulation, rule-clarification timeline, and official-source hierarchy for the Venezuela leader end-2026 market.
Who will be recognized as Venezuela's leader at end of 2026?
Adjacent inventory on the same question cluster.
Rule text, official statements, credible-media consensus, UMA vote status, and public Discord evidence references for the ceasefire-extension market.
Will the US-Iran ceasefire extension market resolve YES?
Normalized Fed decision markets across Kalshi and Polymarket with fee-adjusted spread, executable depth, slippage estimate, and rule-difference card.
Will Kalshi-Polymarket Fed decision spreads stay above 3c before close?
BLS release parser, CPI nowcast delta, Fed-cut odds reaction, and cross-venue repricing table for the next inflation print.
Will the next CPI release move June Fed-cut odds by 10 points?
- Leaderboard source archived every 15 minutes
- Model aliases normalized across public display names
- Rank-change diffs tagged by category and overall table
- Resolution-risk notes separate source availability from model identity
- Benchmark sites may rename or group model aliases before resolution
- Category-specific rank and overall rank must be separated
How a call on this pack settles.
- 01Escrow
When you place a call, the protocol pre-charges your wallet by the per-call price, sized against the pack's predicted quality. Funds sit in escrow — they haven't settled to the seller yet.
- 02Delivery
The seller fulfills the call. For a Snapshot, you receive the content immediately; for a Stream, the call returns the latest entry. The seller's per-trade Deposit is locked in parallel.
- 03Verification window
A short window opens for you to challenge if actual quality falls short of predicted. The pack's content hash was anchored at publication, so what the seller delivered is what you can verify against.
- 04Payout, refund, or clawback
If quality meets or exceeds predicted, escrow releases to the seller and their Deposit unlocks. If it falls short, escrow partially refunds to you by the gap. Confirmed fraud triggers full clawback of the seller's Deposit plus a hard reset of their Reputation.
Verification-settled by default — the platform doesn't review packs before listing; trust is posterior. Dispute routing and bond tiers live under /network.