Tools · Decision-first
A/B Test Calculator
Two numbers most dashboards won’t give you straight: is B actually better than A, and how sure can you be? Enter the results to get lift, a p-value, and a verdict. Then size the next test before you launch it.
Did B beat A?
The test: a two-tailed two-proportion z-test. The p-value is the chance of seeing a gap at least this large if A and B were truly equal; below 0.05 is the usual bar for “real.” The interval is the 95% confidence interval for the true difference in rates. All of this assumes a fixed sample size decided in advance: peeking at a live test and stopping when it looks good inflates false positives.
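For readers who want to check the arithmetic by hand, here is a minimal sketch of the test described above, using only the Python standard library. The function name, field names, and inputs are hypothetical, not the calculator's actual code. It also assumes one common convention: a pooled standard error for the z statistic and an unpooled one for the interval. The calculator itself may differ in such details.

```python
from math import sqrt
from statistics import NormalDist
from typing import NamedTuple


class ABResult(NamedTuple):
    lift: float        # relative lift of B over A, e.g. 0.30 means +30%
    p_value: float     # two-tailed p-value
    ci_low: float      # lower bound of the interval for (rate_b - rate_a)
    ci_high: float     # upper bound of the interval for (rate_b - rate_a)


def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int,
                         confidence: float = 0.95) -> ABResult:
    """Two-tailed two-proportion z-test (illustrative sketch)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    diff = rate_b - rate_a

    # Assumption: pooled standard error under the null "A and B are equal".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se_pooled == 0 or rate_a == 0:
        raise ValueError("need some conversions and some non-conversions")

    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Assumption: unpooled standard error for the interval on the difference.
    se_unpooled = sqrt(rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    return ABResult(
        lift=diff / rate_a,
        p_value=p_value,
        ci_low=diff - z_crit * se_unpooled,
        ci_high=diff + z_crit * se_unpooled,
    )


if __name__ == "__main__":
    # Made-up example: 100/1000 conversions for A, 130/1000 for B.
    print(two_proportion_ztest(100, 1000, 130, 1000))
```

With the made-up inputs in the example, the p-value lands at roughly 0.035 and the interval sits just above zero, which is the kind of result that clears the 0.05 bar without much room to spare.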
How many do I need?
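Standard two-proportion power calculation at a 5% significance level (95% confidence). The count is per variant, so do not halve it: a two-arm test needs twice that in total. Smaller effects cost dramatically more traffic, since the required sample grows roughly with the inverse square of the effect you want to detect; that trade is the whole planning conversation.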
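The sizing step can be sketched the same way. This is a rough illustration of the textbook two-proportion sample-size formula, not the calculator's own code: the function and parameter names are hypothetical, and the 80% power default is an assumption, since the page only states the significance level.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate: float, relative_lift: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Visitors needed in EACH variant for a two-tailed two-proportion test.

    baseline_rate: current conversion rate of A, e.g. 0.10
    relative_lift: smallest lift worth detecting, e.g. 0.10 for +10%
    power: assumed here to default to 0.80 (not stated in the source)
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must be in (0, 1) and must differ")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)

    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


if __name__ == "__main__":
    n = sample_size_per_variant(0.10, 0.10)
    print(f"per variant: {n}, total for two arms: {2 * n}")
    # Halving the detectable lift roughly quadruples the traffic needed.
    print(sample_size_per_variant(0.10, 0.05))
```

Under these assumptions, a 10% baseline with a 10% relative lift needs close to 14,750 visitors per variant, and halving the lift to 5% pushes that to nearly four times as many.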
Measurement before celebration. Knowing whether a number is real — and how much traffic it takes to find out — is the same discipline I bring to an AI feature in production.