
NEWCOMBE INTERVAL

TWO-PROPORTION TEST

Are these two rates actually different?

Two reported rates almost never match exactly. The question worth asking is whether the gap between them is large enough to outrun the noise. This tool builds a Newcombe 95% confidence interval around the difference and reports a plain-language verdict. It does not tell you which rate is “better” — only whether the data support calling them different at all.

CONTROLS

RATE A: 53%, SAMPLE SIZE 1,000

RATE B: 49%, SAMPLE SIZE 1,000

DIFFERENCE

95% INTERVAL FOR A − B

PER-RATE WILSON INTERVALS (CONTEXT)

TAKEAWAY

WHY NEWCOMBE, AND WHY NO P-VALUE?

The Newcombe hybrid score interval combines the two per-rate Wilson intervals into a single confidence interval for the difference A − B. It behaves well at small samples and at extreme proportions, where the simpler normal-approximation (Wald) interval distorts. The verdict tiers above are a deliberate alternative to displaying a p-value: the three plain-language bands (“clearly different,” “too close to call,” “indistinguishable”) state the practical conclusion without inviting the binary thinking that a single significance threshold encourages.
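For readers who want to check the arithmetic, here is a minimal Python sketch of the calculation described above. It assumes the standard Wilson score formula and Newcombe's hybrid score method without continuity correction. The function names `wilson_interval` and `newcombe_interval` are illustrative and not taken from this tool, whose actual implementation and verdict thresholds are not shown here.

```python
from math import sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wilson_interval(p, n, z=Z_95):
    """Wilson score interval for an observed rate p from a sample of size n."""
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def newcombe_interval(p_a, n_a, p_b, n_b, z=Z_95):
    """Newcombe hybrid score interval for the difference p_a - p_b.

    Each bound combines the distances from the observed rates to the
    relevant Wilson limits (no continuity correction).
    """
    l_a, u_a = wilson_interval(p_a, n_a, z)
    l_b, u_b = wilson_interval(p_b, n_b, z)
    diff = p_a - p_b
    lower = diff - sqrt((p_a - l_a) ** 2 + (u_b - p_b) ** 2)
    upper = diff + sqrt((u_a - p_a) ** 2 + (p_b - l_b) ** 2)
    return lower, upper


# Rates and sample sizes as shown in the controls: 53% vs 49%, 1,000 each.
lower, upper = newcombe_interval(0.53, 1000, 0.49, 1000)
print(f"A - B = {0.53 - 0.49:+.3f}, 95% interval [{lower:+.4f}, {upper:+.4f}]")
```

With the rates and sample sizes shown in the controls, the interval runs from roughly −0.004 to +0.084. It straddles zero, so a four-point gap at 1,000 observations per side is not enough on its own to call the two rates different.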