A/B test sample size calculator
Your control group's expected conversion rate.
The minimum relative change in conversion rate you would like to be able to detect.
95% is an accepted standard for statistical significance, although Optimizely allows you to set your own threshold for significance based on your risk tolerance.
Sample size per variation
Sample size calculator
Optimizely's sample size calculator is different from other statistical significance calculators. It is based on the formula used in Experimentation's Stats Engine. Stats Engine calculates statistical significance using sequential testing and false discovery rate controls. Combined, these two techniques mean you no longer need to wait for a pre-set sample size to ensure the validity of your results. If Intelligence Cloud tells you that a result is 95% significant, you can make a decision with 95% confidence.
This calculator estimates the sample size that each variation in your test will need, on average, to measure the desired change in your conversion rate. In many cases, if Intelligence Cloud detects an effect larger than the one you are looking for, you will be able to end your test early.
Our A/B test sample size calculator is powered by the formula behind our new Stats Engine, which uses a two-tailed sequential likelihood ratio test with false discovery rate controls to calculate statistical significance.
With this methodology, you no longer need to use the sample size calculator to ensure the validity of your results. Instead, the A/B test calculator is best used as a tool for planning out your testing program to find out how long you may need to wait before Optimizely can determine whether your results are significant, depending on the effect you want to observe.
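To give a feel for what "false discovery rate controls" means when a test tracks several metrics or variations at once, here is a minimal sketch of the textbook Benjamini-Hochberg procedure. It is an illustration only, not a description of how Stats Engine is implemented; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one True/False flag per p-value: True means 'declare significant'.

    Textbook Benjamini-Hochberg procedure (an assumption for illustration,
    not Optimizely's implementation). q is the target false discovery rate.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value is at or below k * q / m.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_rank = rank

    # Everything ranked at or below that k is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[idx] = True
    return significant


# Four hypothetical comparisons in a single experiment.
p_values = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(p_values, q=0.05))
```

Checking each of these p-values against 0.05 on its own would flag three results; the correction keeps only the strongest one, which is the kind of trade-off an FDR control makes to limit false winners.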
In traditional hypothesis testing, the minimum detectable effect (MDE) is essentially the sensitivity of your test. In other words, it is the smallest relative change in conversion rate you are interested in detecting. For example, if your baseline conversion rate is 20% and you set an MDE of 10%, your test would detect any change that moves your conversion rate outside the absolute range of 18% to 22% (in this example, a 10% relative effect is a 2 percentage point absolute change in conversion rate).
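The relative-to-absolute conversion in the example above can be sketched in a few lines. The function name `detectable_range` is hypothetical and used only for illustration.

```python
def detectable_range(baseline_rate, relative_mde):
    """Convert a relative MDE into the absolute conversion-rate range around the baseline.

    Both arguments are fractions, e.g. 0.20 for 20% and 0.10 for 10%.
    """
    absolute_change = baseline_rate * relative_mde
    return baseline_rate - absolute_change, baseline_rate + absolute_change


# Baseline of 20% with a relative MDE of 10%, as in the example above.
low, high = detectable_range(0.20, 0.10)
print(f"Changes outside {low:.2%} to {high:.2%} are detectable")
```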
Decide how much test sensitivity you are willing to trade for a shorter test. The smaller the MDE, the more sensitive you are asking your test to be, and the larger the sample size you will need.
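To illustrate how quickly sample size grows as the MDE shrinks, the sketch below uses the classical fixed-horizon formula for comparing two proportions. This is not the formula behind this calculator or Stats Engine, so its numbers will differ from the calculator's output; the function name `classical_sample_size` and the 80% power default are assumptions made for the example.

```python
import math
from statistics import NormalDist


def classical_sample_size(baseline_rate, relative_mde, significance=0.95, power=0.80):
    """Per-variation sample size from the classical two-sided two-proportion z-test.

    Assumption: a traditional fixed-horizon test, NOT Stats Engine's
    sequential method. Rates and MDE are fractions (0.20 means 20%).
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)  # rate after the relative lift

    # Standard normal quantiles for the significance threshold and the power.
    z_alpha = NormalDist().inv_cdf(1 - (1 - significance) / 2)
    z_beta = NormalDist().inv_cdf(power)

    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Same 20% baseline, progressively smaller relative MDEs.
for mde in (0.20, 0.10, 0.05):
    print(f"MDE {mde:.0%}: {classical_sample_size(0.20, mde)} visitors per variation")
```

Because the absolute difference is squared in the denominator, halving the MDE roughly quadruples the required sample size under this classical formula.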
Keep in mind that, while the experiment is running, statistical significance in Intelligence Cloud's Stats Engine shows you the chance that your results will ever be significant. If the effect that Stats Engine observes is larger than the minimum detectable effect you are looking for, your test may declare a winner or loser up to twice as fast as if you had to wait for your pre-set sample size. Given more time, Stats Engine may also detect an effect smaller than the MDE you set.
Statistical power is essentially a measure of whether your test has adequate data to reach a conclusive result. Intelligence Cloud's stats engine runs tests that always achieve a power of one, meaning that the test always has adequate data to show you results that are valid at that moment, and will eventually detect a difference if there is one. This means that you can make a decision as soon as your results reach significance without worrying about power.