
A/B Test Calculator

Calculate statistical significance for your A/B tests

To calculate statistical significance, enter four numbers:

  • Number of visitors in the control group
  • Number of conversions in the control group
  • Number of visitors in the variant group
  • Number of conversions in the variant group

Frequently Asked Questions

What confidence level should I use?

95% is the industry standard for most A/B tests. Use 99% for critical changes that could significantly impact revenue or user experience. Never go below 90% confidence.

How long should I run my test?

Run tests for at least 1-2 weeks to capture weekly patterns (weekday vs weekend behavior). For e-commerce, run through at least one full purchase cycle.

What if my test never reaches significance?

If after 4 weeks you haven't reached significance, the change likely has minimal impact. You can either accept the variant isn't better or test a more dramatic change.

What is Statistical Significance?

Statistical significance tells you whether the difference between your control and variant reflects a real effect or just random chance. A 95% confidence level (the industry standard) means that if there were truly no difference between the variations, you would see a result this extreme less than 5% of the time.

How Statistical Significance is Calculated

1. Calculate Conversion Rates

Control CR = Control Conversions / Control Visitors

Variant CR = Variant Conversions / Variant Visitors

2. Calculate Z-Score

The z-score measures how many standard errors the observed difference in conversion rates is from zero: z = (Variant CR − Control CR) / SE, where SE is the standard error of the difference (commonly computed from the pooled conversion rate).

3. Determine Confidence Level

|Z-score| ≥ 1.96 = 95% confidence (statistically significant)
|Z-score| ≥ 2.576 = 99% confidence (highly significant)

These thresholds assume a two-sided test.
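The three steps above can be sketched in Python. This uses a pooled two-proportion z-test, a common formulation for A/B significance (the calculator's exact internals aren't shown on this page, so treat this as an illustrative implementation):

```python
import math

def ab_test_significance(control_visitors, control_conversions,
                         variant_visitors, variant_conversions):
    """Two-proportion z-test with a pooled standard error."""
    # Step 1: conversion rates
    cr_control = control_conversions / control_visitors
    cr_variant = variant_conversions / variant_visitors

    # Step 2: z-score, using the pooled conversion rate for the standard error
    p_pooled = (control_conversions + variant_conversions) / \
               (control_visitors + variant_visitors)
    se = math.sqrt(p_pooled * (1 - p_pooled) *
                   (1 / control_visitors + 1 / variant_visitors))
    z = (cr_variant - cr_control) / se

    # Step 3: two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return cr_control, cr_variant, z, p_value

# Example: 5.0% vs 5.8% conversion on 10,000 visitors per group
cr_c, cr_v, z, p = ab_test_significance(10000, 500, 10000, 580)
```

With these inputs the z-score clears the 1.96 threshold, so the lift would be reported as significant at 95% confidence.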

Sample Size Guidelines

Minimum Sample

1,000+

Per variation for basic tests

Recommended Sample

5,000+

Per variation for reliable results

Small Changes

10,000+

Needed to detect small lifts (<5%)

High Confidence

20,000+

For critical business decisions
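The guidelines above can also be derived from a standard sample-size formula for comparing two proportions. The function below is a sketch using the usual normal approximation; the exact numbers depend on your baseline conversion rate and the smallest lift you care about detecting:

```python
import math
from statistics import NormalDist

def required_sample_size(baseline_cr, min_relative_lift, alpha=0.05, power=0.80):
    """Approximate per-variation sample size to detect a relative lift.

    Standard formula: n = (z_a + z_b)^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2
    """
    p1 = baseline_cr
    p2 = baseline_cr * (1 + min_relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    n = ((z_alpha + z_beta) ** 2 *
         (p1 * (1 - p1) + p2 * (1 - p2))) / (p1 - p2) ** 2
    return math.ceil(n)

# Detecting a small (5%) relative lift on a 5% baseline takes a huge sample,
# while a 20% relative lift needs far fewer visitors per variation.
small_lift_n = required_sample_size(0.05, 0.05)
large_lift_n = required_sample_size(0.05, 0.20)
```

This is why small lifts demand much larger samples: halving the detectable lift roughly quadruples the required traffic.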

Common A/B Testing Mistakes

1. Stopping Tests Too Early

Wait for statistical significance AND run for at least 1-2 full business cycles

2. Testing Too Many Variations

More variations = more traffic needed. Stick to 2-3 variations max

3. Ignoring External Factors

Seasonality, marketing campaigns, and holidays can skew results

4. Testing Multiple Changes

Test one change at a time to know what caused the difference

5. Not Accounting for Novelty Effect

Initial lift may fade as users get used to the change. Run tests for 2+ weeks

When to Stop Your Test

  • Reached Significance: 95%+ confidence with sufficient sample size
  • Completed Full Cycle: At least 1-2 weeks to account for weekly patterns
  • No Movement: If after 4+ weeks there's no trend toward significance, stop
  • Negative Impact: If variant is significantly worse, stop immediately
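The checklist above can be expressed as a simple decision function. The thresholds here (1,000 minimum sample, 14-day minimum, 28-day cap) are illustrative defaults drawn from this guide, not universal rules:

```python
def should_stop_test(z_score, per_variant_n, days_running,
                     min_n=1000, min_days=14, max_days=28):
    """Sketch of the stopping checklist; tune thresholds to your context."""
    if z_score <= -1.96:
        return "stop: variant is significantly worse"
    if z_score >= 1.96 and per_variant_n >= min_n and days_running >= min_days:
        return "stop: significant with sufficient sample and full cycles"
    if days_running >= max_days:
        return "stop: no significance after 4+ weeks"
    return "keep running"
```

Note the ordering: a significantly negative result stops the test immediately, while a positive one must also satisfy the sample-size and duration checks.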

Track A/B Tests Automatically

Stop calculating test results manually. Automatically track A/B test performance across all platforms and get real-time significance updates.