The proof, with receipts

68% YES. We said 1%. It resolved NO.

If there's no human survey to check against, how do you know a synthetic panel reads reality at all? So we ran the one test we could. On the single topic where real-money markets exist on opinion, Trump approval, we asked whether our panel could read outcomes before contracts resolved. The answer: the signal is real. It's also not alpha. Both matter, and they're easy to confuse.

98: resolved contracts, pre-registered

70.4%: our direction accuracy vs. 72.4% for the market

63 / 98: cells where we were the closer estimate

0: settlement leak (odds-only, pre-resolution)

The cell that started it

“Will Trump's approval hit 40% in 2025?” At decision time, Polymarket priced 68% YES. Our panel said ~1%. It resolved NO. That wasn't a fluke — the same pattern held across overpriced longshots. The market kept overweighting tail outcomes; the panel kept reading the underlying approval level and calling them out.

Where the market overpriced, the panel held the line

Six clean, pre-resolution examples. Market price vs. our forecast at the decision date — all resolved NO.

| Question | Market | Lewsearch | Resolved |
| --- | --- | --- | --- |
| Trump approval hits 40% in 2025 | 68% YES | ~1% YES | NO |
| Trump approval hits 35% in 2025 | 68% YES | ~1% YES | NO |
| Approval between 47.0–47.4% on Apr 4 | 60% YES | ~1% YES | NO |
| Approval hits 40% before August | 55% YES | ~1% YES | NO |
| Approval hits 43% before August | 60% YES | ~7% YES | NO |
| Approval between 38.0–38.4% on May 29 | 52% YES | ~1% YES | NO |

The honest scoreboard, no spin

We froze every resolved Trump-approval contract we could map (n=98), ran a nationally weighted panel of 2,000 agents on each, and compared our forecast to the Polymarket mid-price at a pre-registered decision date — odds, not the eventual outcome. Nothing was scored against settlement.

| Metric | Lewsearch | Polymarket | Read |
| --- | --- | --- | --- |
| Direction accuracy (overall, n=98) | 70.4% | 72.4% | Near-tie, leans the market |
| Approval brackets (n=46) | 80.4% | 78.3% | We edge ahead |
| Approval thresholds (n=32) | 68.8% | 78.1% | Market edges ahead |
| Closer per-cell estimate (Brier) | won 63 of 98 | | We were closer on a majority of cells |
| When we disagreed with the market (n=26) | right 46% | | A coin flip, not a trading edge |

We were the closer estimate on 63 of 98 cells (about 64%), yet our aggregate Brier score is worse (0.26 vs. 0.16; lower is better). That is the classic signature of winning many small cells and eating a few confident misses. Interesting. Not bankable.
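Winning most cells while losing on aggregate Brier can look contradictory, so here is a minimal sketch of how it happens. The ten cells below are made up for illustration and are not drawn from our run. The function names are ours, and the only thing assumed is the standard Brier definition: the squared gap between a forecast probability and the 0/1 outcome.

```python
# Toy illustration only: these ten cells are invented, not taken from the 98-contract run.

def brier(prob_yes, outcome):
    """Brier score for one cell: squared error between forecast and 0/1 outcome."""
    return (prob_yes - outcome) ** 2

def scoreboard(panel, market, outcomes):
    """Return (cells where the panel is closer, panel mean Brier, market mean Brier)."""
    panel_scores = [brier(p, o) for p, o in zip(panel, outcomes)]
    market_scores = [brier(m, o) for m, o in zip(market, outcomes)]
    wins = sum(ps < ms for ps, ms in zip(panel_scores, market_scores))
    n = len(outcomes)
    return wins, sum(panel_scores) / n, sum(market_scores) / n

# Seven longshots resolve NO (0); three contracts resolve YES (1).
outcomes = [0] * 7 + [1] * 3
# Hypothetical panel: very confident NO everywhere, so three confident misses.
panel = [0.01] * 7 + [0.05] * 3
# Hypothetical market: hedged prices, never badly wrong.
market = [0.20] * 7 + [0.60] * 3

wins, panel_mean, market_mean = scoreboard(panel, market, outcomes)
print(f"panel closer on {wins} of {len(outcomes)} cells")
print(f"mean Brier: panel {panel_mean:.3f}, market {market_mean:.3f}")
```

In this toy set the panel is closer on 7 of 10 cells, but its three confident misses each cost about 0.90, which swamps the small per-cell gains and leaves its mean Brier well above the market's.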

Can we forecast before resolution?

Mostly yes. ~70% direction, strong on brackets, and several large, clean longshot reads where the market was badly mispriced.

Do we beat Polymarket overall?

No. They edge us on direction. We win brackets, lose thresholds. Call it a near-tie that leans the market's way.

Could you trade on us?

No evidence of that. When we disagree with the market we're right about half the time. That's a coin flip, not alpha — and we ship no trading feature.
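For readers who want to check the "coin flip" reading themselves, here is a rough sketch of an exact two-sided binomial test against a 50% hit rate. One assumption is ours: "right 46%" on n=26 is taken to mean 12 correct calls, the only whole number that rounds to 46%. The function name is ours as well, and this is not a description of how the original analysis was run.

```python
from math import comb

def two_sided_binomial_p(k, n, p=0.5):
    """Exact two-sided p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so ties with pmf[k] are counted despite float rounding.
    return sum(x for x in pmf if x <= pmf[k] * (1 + 1e-12))

# Assumption: 46% of 26 disagreements corresponds to 12 correct calls.
correct, disagreements = 12, 26
p_value = two_sided_binomial_p(correct, disagreements)
print(f"hit rate {correct / disagreements:.1%}, two-sided p-value {p_value:.2f}")
```

Under that assumption the p-value comes out around 0.85, so 12 of 26 is entirely consistent with guessing. With only 26 disagreements the test also has little power, which is one more reason not to read an edge into it.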

What the test actually proves

Not “use Lewsearch instead of Polymarket.” Not “fade the market when we disagree.” What it proves is the thing that matters for our customers: the panel engine isn't hallucinating on opinion-shaped questions. It tracks real-world political outcomes before they resolve, roughly as well as a market full of people betting real money. That's a credibility check — and it passed.

Your actual questions don't have markets. “Will this ad move suburban women in Ohio?” “How does Message A test against B with our segment?” That's the job. This was the proof.

Forecasts are AI-generated estimates, not interviews with human respondents and not probability samples. All 98 cells used pre-resolution market prices only; nothing was scored against settlement. Full write-up and underlying run are linked above.