Does Runir publish its track record?

Yes. Every wall is graded the next session against what spot actually did (untested, rejected, pinned, or broke), and the held-over-tested rate ships with its sample size and a Wilson 95% confidence band. Most options services publish no verifiable record at all.

What does the published hit rate actually mean?

It is a historical base rate, not a forecast — how often walls of a given type held in the past. It informs sizing across many decisions; it does not promise what any single next wall will do.

Why we publish our hit rate (and almost nobody does)

TL;DR

Almost no options service publishes a verifiable track record. Runir does, because levels published without one are unauditable claims. The rate ships with its sample size and is recomputed every session, the weekly receipt opens on a recent miss (not the best hit), and every grade traces back to raw measurements anyone can re-derive.

Pick a random options service and look for its track record. Most haven't published one. The ones that have tend to pick favorable windows, cherry-pick wins, or use definitions of “hit” that drift quietly in their favor. That's not an accident: an unverifiable track record is a marketing asset; a verifiable one is a liability the moment the next call goes wrong.

The discipline of publishing a hit rate — defined the same way every time, computed mechanically, with sample size always attached — is the single accountability move that separates an options research product from a marketing surface. That's the bet Runir runs on. If we don't hold the rate up, none of the rest is worth the bytes.

What “hit rate” actually means here

Every wall Runir publishes is graded once the next session completes, into one of four classes: untested, rejected, pinned, or broke. The full taxonomy and the reason each class exists are covered in the pinning vs breaking piece.

The published hit rate is held over tested: walls that were actually challenged and held, divided by walls that were challenged at all. Untested walls count toward neither side, so walls that price never touched can't inflate the headline by being scored as wins. (Plenty of competing services do exactly that.) The denominator matters as much as the numerator; cutting corners on the denominator is the most common way a real-looking number ends up meaning nothing.
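To make the denominator point concrete, here is a minimal sketch of the held-over-tested calculation. The grade labels come from the four-class taxonomy; treating both "rejected" and "pinned" as held is our reading for illustration, and the function name and sample grades are hypothetical, not the production classifier.

```python
from collections import Counter

# Labels follow the four-class taxonomy in the text.
# Assumption for this sketch: "rejected" and "pinned" both count as held.
HELD = {"rejected", "pinned"}
TESTED = HELD | {"broke"}  # "untested" is deliberately left out of both sides


def held_over_tested(grades):
    """Return (held, tested, rate); rate is None when nothing was tested."""
    counts = Counter(grades)
    held = sum(counts[g] for g in HELD)
    tested = sum(counts[g] for g in TESTED)
    rate = held / tested if tested else None
    return held, tested, rate


# Made-up grades, purely illustrative.
grades = ["rejected", "untested", "pinned", "broke", "untested", "rejected"]
held, tested, rate = held_over_tested(grades)

# What the headline would look like if untested walls were scored as wins.
inflated = (held + grades.count("untested")) / len(grades)

print(f"held/tested = {held}/{tested} = {rate:.2f}")
print(f"inflated by counting untested as wins = {inflated:.2f}")
```

In this toy sample the honest rate is 3 of 4, while counting the two untouched walls as wins pushes the headline to 5 of 6 without a single extra wall having been challenged.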

As of the most recent score-out, 5,349 walls have been graded across the Runir universe. The live held-rate, the sample size for the most recent rolling window, and the Wilson 95% confidence band all live on the homepage receipts hero — refreshed every session.

Why the post opens on the worst recent miss

The Friday Accountability Card — Runir's weekly receipt — opens on the most recently broken wall, not the best hit of the week. That's a deliberate inversion of the industry default. The pattern most options services follow is to lead with their best call, bury the misses in a footnote, and let the reader infer a track record from the highlight reel.

Opening on the miss does three things at once. It signals to the reader that the framework expects misses (frequencies aren't certainties); it reverses survivorship bias inside our own publication; and it makes the rest of the card meaningful — if we were hiding misses, why would we lead with one? The discipline is editorial as much as analytical. A research product that can't open on its own failures is a research product that hasn't made peace with what frequencies actually are.

The same discipline runs the per-name pages and the rate itself. When a wall breaks, we publish the break, score it, and let it count against the universe-wide rate. There's no asterisk for “regime change,” no “we called this one but the market was wrong.” A miss is a miss. The classifier doesn't take a side; the rate doesn't round in our favor.

Why sample size always ships with the rate

A 75% hit rate at n=8 is statistically indistinguishable from a coin flip. The same rate at n=300 is a signal you can size around. Reporting the rate without the sample size is one of the most common ways a number that should mean something gets used to mean nothing.
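One way to check the coin-flip claim yourself is an exact two-sided binomial test against a 50% null. The sketch below is an illustration using only the standard library; the function name is ours, and the choice of test is an assumption rather than the method behind the published numbers.

```python
from math import comb


def two_sided_p_vs_coin_flip(held, tested):
    """Exact two-sided binomial p-value against a fair coin (p = 0.5).

    Under p = 0.5 the distribution is symmetric, so the two-sided
    p-value is twice the smaller tail, capped at 1.0.
    """
    tail = min(held, tested - held)
    one_tail = sum(comb(tested, i) for i in range(tail + 1))
    return min(1.0, 2 * one_tail / 2 ** tested)


# The same 75% rate at the two sample sizes from the text.
p_small = two_sided_p_vs_coin_flip(6, 8)      # 6 held out of 8 tested
p_large = two_sided_p_vs_coin_flip(225, 300)  # 225 held out of 300 tested

print(f"n=8:   p = {p_small:.3f}")
print(f"n=300: p = {p_large:.2e}")
```

At n=8 the p-value is about 0.29, nowhere near a conventional 0.05 threshold, so a fair coin would produce a result this lopsided fairly often. At n=300 the p-value is vanishingly small.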

Every Runir surface that shows a rate also shows the sample. The receipts hero on the homepage renders the running held-rate as a line with a Wilson 95% confidence band — the band tightens as the sample grows. A wide band early in the data means “this rate is consistent with anything from X% to Y%; wait for more data before treating it as a fact.” A narrow band late in the data means “the rate has settled; treat it as a base rate, with the usual caveats.”
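The tightening band can be reproduced with the textbook Wilson score formula. This is a minimal sketch, not the site's code: the function name and the sample counts are illustrative, and z = 1.959964 is the usual two-sided 95% normal quantile.

```python
from math import sqrt


def wilson_interval(held, tested, z=1.959964):
    """Wilson score interval for a binomial proportion (95% by default)."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    p = held / tested
    z2 = z * z
    denom = 1 + z2 / tested
    center = (p + z2 / (2 * tested)) / denom
    half = z * sqrt(p * (1 - p) / tested + z2 / (4 * tested * tested)) / denom
    return center - half, center + half


# The same 75% held-rate at three illustrative sample sizes.
bands = {}
for held, tested in [(6, 8), (75, 100), (225, 300)]:
    lo, hi = wilson_interval(held, tested)
    bands[tested] = (lo, hi)
    print(f"n={tested:>3}: {held / tested:.0%} held, band {lo:.1%} to {hi:.1%}")
```

At n=8 the band runs from roughly 41% to 93%, which still includes 50%. By n=300 it has narrowed to roughly 70% to 80%.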

The Wilson score interval is a standard textbook interval for a binomial proportion, and it stays well-behaved at small samples. Edwin B. Wilson published it in 1927, so the math is nearly a century old. The four-class taxonomy, the canonical constants used to derive each grade, and the changelog of every math audit that has refined the formulas over time all live on the methodology page. Nothing is proprietary. Anyone willing to do the math can re-derive every number from the raw measurements.

What it doesn't tell you

The hit rate is a historical measurement, not a forward forecast. It says how often walls of this class, in this regime, held in the past. It doesn't say what the next wall will do. A 73% historical held-rate doesn't mean any individual wall has a 73% chance of holding — base rates are inputs to decision-sizing across many trades, not single-trade predictions. The framework gives you a frequency, not a guarantee; sizing a trade against the frequency is the trader's job, and nobody can do that step for you and remain honest.

The rate is also computed over a finite past, with a finite universe, in a finite market regime. If the volatility environment shifts dramatically — a sustained vol spike like 2020, a Fed-shock cycle, a structural change in how options market-making is run — historical rates may not generalize cleanly forward. The Wilson band gives statistical bounds; it does not bound a regime change. That risk is on the reader, not the rate.

What the rate does give you is a verifiable, auditable, sample-sized claim — instead of a service's word. That's the floor we work above. Whether the rate is high enough to size trades around is a separate question, answered by each reader matching it against their own risk tolerance.

Where to see it live

The homepage receipts hero shows the cumulative held-rate curve with the Wilson 95% confidence band; every session adds a new point as fresh wall grades land. The methodology page has the math behind the curve, the canonical constants the classifier uses to derive each grade, and the methodology changelog of every math audit that has refined the formulas over time. (Every audit that moves the rate is published with the diff, the same transparency discipline as the rate itself.)

Read the pinning vs breaking piece for the four-class grading taxonomy underneath every number. Then open any per-ticker page — NVDA's, say — and the per-name reliability section appears once enough sessions have been graded for that ticker to publish a stable rate. Per-name rates only show once the sample is meaningful; the cutoff itself is published on the methodology page.
