The receipts

How accurate are we?

We predict pump.fun mint graduation in the first 30-60 seconds. Every prediction is logged before its outcome is known, then resolved against on-chain truth (98.4% of labels). Three numbers answer three different questions; read them together:

Graduates · on-chain verified

…

of in-band predictions that actually graduated

Forward, calibrated — …; every prediction hashed before its outcome

Live composite signal

…

3-tier ACT / WATCH / SCOUT — forward-validating

verdict publishes at the pre-registered sample floor — we don't publish a rate before it's earned

Sustains 30m post-bond (live)

…

of graduated mints held ≥80% of grad price 30 min later

n=… resolved post-bond outcomes

The three numbers measure different things and shouldn't be averaged.

Graduates · on-chain verified asks "of the forward predictions we made, how many actually graduated?" Every prediction is hashed before its outcome and resolved against on-chain truth. This is not a backtest; these are the real receipts.

Live composite signal is the current 3-tier product and is deliberately pre-verdict. It is forward-validating under a pre-registered sample floor, and we do not publish a rate before it is earned. The verdict (and the failure, if it fails) lands on the receipts trail.

Sustains 30m post-bond asks "of the mints that did graduate, how many held value 30 min later?", where holding value means staying at ≥80% of the graduation price. It is sourced independently from on-chain DEX prices. This is the question a trader actually has, since graduation alone is not a profit thesis.
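One way to picture "hashed before its outcome" is a plain commitment hash: serialize the prediction, publish its digest, and let anyone recompute the digest later. The sketch below is illustrative only. The field names (mint, grad_prob, made_at), the canonical-JSON encoding, and the choice of SHA-256 are assumptions; the page does not describe how the real hashes are built.

```python
import hashlib
import json


def commit_prediction(mint: str, grad_prob: float, made_at: int) -> str:
    """Return a SHA-256 hex digest that commits to one prediction.

    All three fields are hypothetical; they stand in for whatever the
    real prediction record contains.
    """
    record = {"mint": mint, "grad_prob": grad_prob, "made_at": made_at}
    # Canonical JSON (sorted keys, no extra whitespace) so that the same
    # record always serializes, and therefore hashes, identically.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify_prediction(mint: str, grad_prob: float, made_at: int,
                      published_digest: str) -> bool:
    """Recompute the digest and compare it with the published one."""
    return commit_prediction(mint, grad_prob, made_at) == published_digest


if __name__ == "__main__":
    digest = commit_prediction("ExampleMint111", 0.82, 1_700_000_000)
    print(digest)
    print(verify_prediction("ExampleMint111", 0.82, 1_700_000_000, digest))
```

Because the digest is fixed before the outcome exists, changing the stated probability afterwards would produce a different digest and fail verification.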

By confidence band

The model is honest about uncertainty: lower-confidence calls graduate at lower rates, exactly as predicted. The Telegram bot fires only at ≥70% confidence.

If we say	Actual graduation rate	Sample size
loading…
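A table with these three columns can be rebuilt from any list of resolved predictions. The sketch below assumes each prediction is a (predicted probability, graduated) pair and uses 10-point bands; both the input shape and the band width are assumptions, not the page's documented method.

```python
from collections import defaultdict


def graduation_rate_by_band(predictions, band_pct=10):
    """Group resolved predictions into confidence bands.

    predictions: iterable of (predicted_prob in [0, 1], graduated: bool).
    Returns rows of (band label, actual graduation rate, sample size),
    mirroring the "If we say / Actual graduation rate / Sample size" columns.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    last_band = 100 // band_pct - 1
    for prob, graduated in predictions:
        # Work in whole percent to avoid float artifacts such as
        # 0.7 / 0.1 evaluating to 6.999...; a prediction of 1.0 is
        # placed in the top band.
        band = min(int(round(prob * 100)) // band_pct, last_band)
        totals[band] += 1
        hits[band] += int(graduated)

    rows = []
    for band in sorted(totals):
        low = band * band_pct
        rows.append((f"{low}-{low + band_pct}%",
                     hits[band] / totals[band],
                     totals[band]))
    return rows


if __name__ == "__main__":
    sample = [(0.75, True), (0.72, False), (0.35, False),
              (0.31, False), (0.95, True)]
    for label, rate, n in graduation_rate_by_band(sample):
        print(f"{label}\t{rate:.0%}\t{n}")
```

A well-calibrated model shows an actual rate that falls inside, or close to, each band's own range.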

SYSTEM STATUS

CALIBRATED · STABLE

verify any prediction →

runner_prob fields — directional, recalibration pending

The runner_prob_2x/5x/10x_from_now fields exposed at /api/v1/probe and /api/live are currently directional: mints with a higher runner_prob do hit at higher rates, but the absolute probability is overstated by roughly 11-13 percentage points in the high-confidence bins (≥0.5). The saturation case, where kNN reports 1.0 because every neighbor hit, is the loudest miss: runner_prob_5x_from_launch at a predicted 0.99 has an actual rate of around 0.29.
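The saturation effect follows from how a neighbor-vote estimate works in general: if the probability is the share of the k nearest historical mints that hit, it reads exactly 1.0 whenever all k of them did, however few they are. The sketch below is a generic kNN vote, not the model's actual code; the feature vectors, the Euclidean distance, and the value of k are all assumptions.

```python
import math


def knn_hit_probability(query, history, k=5):
    """Estimate a hit probability as the share of the k nearest neighbors that hit.

    query:   feature vector for the mint being scored (hypothetical features).
    history: list of (feature_vector, hit: bool) for resolved mints.
    """
    nearest = sorted(history, key=lambda row: math.dist(query, row[0]))[:k]
    return sum(1 for _, hit in nearest if hit) / len(nearest)


if __name__ == "__main__":
    history = [
        ((0.0, 0.0), True),
        ((0.1, 0.0), True),
        ((0.0, 0.1), True),
        ((5.0, 5.0), False),
        ((6.0, 5.0), False),
    ]
    # With k=3 every neighbor hit, so the raw estimate saturates at 1.0.
    print(knn_hit_probability((0.0, 0.0), history, k=3))
    # A wider neighborhood pulls in the misses and lowers the estimate.
    print(knn_hit_probability((0.0, 0.0), history, k=5))
```

A saturated 1.0 here reflects three agreeing neighbors, not certainty, which is why a raw vote of this kind needs calibration before it is read as a probability.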

Magnitude recalibration is in progress via the existing apply_calibration infrastructure (the same self-correcting curve grad_prob uses). Until recalibration is verified, treat runner_prob fields as a ranking signal, not as a literal probability. Consumers making sizing decisions on the absolute number should discount by ~12pp at the high end. The full audit is intentionally surfaced here rather than hidden (/api/scope documents the field; n=89,077 sample), the same discipline as the warming label on the live rate above.
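For a consumer, that guidance could look like the sketch below: rank on the raw value, and apply the haircut only when an absolute number is needed for sizing. The function names, the flat 12-point discount, and the 0.5 cutoff for "the high end" are assumptions taken from the rough figures above. This is a stopgap on the consumer side, not the apply_calibration curve.

```python
def discounted_runner_prob(raw_prob, discount_pp=12.0, high_end=0.5):
    """Apply a flat haircut to high-confidence runner_prob values.

    Values below `high_end` pass through unchanged. Note that this makes
    the output discontinuous at the cutoff (0.49 stays 0.49 while 0.50
    becomes 0.38), so use it for sizing only, never for ordering.
    """
    if raw_prob < high_end:
        return raw_prob
    return max(0.0, raw_prob - discount_pp / 100.0)


def rank_mints(raw_probs):
    """Order mint ids from highest to lowest raw runner_prob.

    raw_probs: dict of mint id -> raw runner_prob. Ranking uses the raw
    value because the fields are described as directionally reliable.
    """
    return sorted(raw_probs, key=raw_probs.get, reverse=True)


if __name__ == "__main__":
    probs = {"mintA": 0.80, "mintB": 0.30, "mintC": 0.99}
    print(rank_mints(probs))
    for mint, p in probs.items():
        print(mint, round(discounted_runner_prob(p), 2))
```

A flat haircut still understates the gap in the saturation case described above, so the discounted number is an upper bound rather than a calibrated probability.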

raw JSON: /api/accuracy · NFA · DYOR · prediction model output, not financial advice