How many games before we can tell?
CI Explorer asks how uncertain one rate is. Two-Proportion Test asks whether two rates are different. Power Analysis asks the question that grounds both: if a real difference exists, how much data would it take to see it? The tool translates the statistical answer into tournament-calendar reality — a month of majors, a balance window, a full edition — so the reader can judge whether the question is even answerable on the timeline they care about.
The required-sample-size and minimum-detectable-effect numbers come from the standard two-proportion z-test formulas at the chosen significance and power levels. Both assume that every game in the dataset is an independent observation drawn from the same underlying win-rate distribution.
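As a rough illustration of those formulas, the Python sketch below computes a per-group sample size and a minimum detectable effect. The function names, the defaults of 5% two-sided significance and 80% power, and the equal-group-size setup are assumptions made for the example, not a description of the tool's actual code.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_games_per_group(p1: float, p2: float,
                             alpha: float = 0.05, power: float = 0.80) -> int:
    """Games needed in EACH group for a two-sided two-proportion z-test.

    Hypothetical helper; assumes equal group sizes and independent games.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ; a zero effect is never detectable")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    p_bar = (p1 + p2) / 2                          # pooled rate under the null
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


def minimum_detectable_effect(n_per_group: int, baseline: float = 0.50,
                              alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate smallest win-rate gap detectable with n games per group.

    Simplification (an assumption of this sketch): both groups are given the
    baseline variance, which is reasonable when the gap is small.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) * sqrt(2 * baseline * (1 - baseline) / n_per_group)


# Detecting a 50% vs 55% win rate:
print(required_games_per_group(0.50, 0.55))          # 1565 games per group
# Smallest detectable gap with 600 games per group at a 50% baseline:
print(round(minimum_detectable_effect(600), 3))      # about 0.081
```

Because the detectable gap shrinks with the square root of the sample size, going from 600 to 2,400 games per group only halves it, from about 8 to about 4 percentage points under these assumptions.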
Swiss tournament pairing breaks that assumption. Bracket structure, drop rates, side-balancing, and faction-vs-faction matchup distributions all introduce correlation between games that the simple formulas don't account for. As a rule of thumb, treat the numbers here as a floor: detecting a given edge in real tournament data takes more games than the formulas suggest, sometimes meaningfully more. The "Swiss Isn't Random" methodology piece (planned) will quantify this gap.
The cadence thresholds (600 games ≈ one MFM cycle; 2,400 games ≈ one full season) are calibrated, respectively, to roughly one MFM edition's worth of major-tournament data per faction at typical play rates and to one full year of activity. Treat them as anchors, not bright lines.
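To make the calendar translation concrete, here is a minimal sketch that converts a required game count into those two anchors. Only the 600 and 2,400 figures come from the text; the function names and the wording of the verdicts are hypothetical, and the tool's own mapping may differ.

```python
GAMES_PER_MFM_CYCLE = 600    # anchor from the text: about one MFM cycle
GAMES_PER_SEASON = 2400      # anchor from the text: about one full season


def calendar_equivalent(required_games: int) -> dict:
    """Express a required game count in MFM cycles and seasons (hypothetical helper)."""
    return {
        "mfm_cycles": required_games / GAMES_PER_MFM_CYCLE,
        "seasons": required_games / GAMES_PER_SEASON,
    }


def describe(required_games: int) -> str:
    """Rough verdict on whether the question fits the timeline (illustrative wording)."""
    if required_games <= GAMES_PER_MFM_CYCLE:
        return "answerable within about one MFM cycle"
    if required_games <= GAMES_PER_SEASON:
        return "answerable within about one full season"
    return "needs more than one full season of data"


print(calendar_equivalent(1200))   # {'mfm_cycles': 2.0, 'seasons': 0.5}
print(describe(1200))              # answerable within about one full season
```

Since the thresholds are anchors rather than bright lines, the fractional cycle and season counts are usually more informative than the three-way verdict.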