Statistical significance is the likelihood that the difference in conversion rates between a given variation and the baseline is not due to random chance. A result of an experiment is said to have statistical significance, or be statistically significant, if it is likely not caused by chance for a given statistical significance level.
Your statistical significance level reflects your risk tolerance and confidence level. For example, if you run an A/B testing experiment with a significance level of 95%, this means that if you determine a winner, you can be 95% confident that the observed results are real and not an error caused by randomness. Put another way, when there is no real difference between the variation and the baseline, there is only a 5% chance that random fluctuation alone would lead you to declare a winner (a false positive).
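To make this concrete, here is a minimal Python sketch of one common way to check significance by hand: a two-sided, two-proportion z-test. It is a textbook method shown for illustration only, not a description of how any particular testing tool computes its results. The function names and the visitor and conversion counts are made up for the example.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_a / n_a: conversions and visitors for the baseline.
    conv_b / n_b: conversions and visitors for the variation.
    Returns (z, p_value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that both versions convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

def is_significant(p_value, significance_level=0.95):
    """A 95% significance level corresponds to a p-value threshold of 0.05."""
    return p_value < 1 - significance_level

# Hypothetical data: 2.0% vs. 2.6% conversion on 10,000 visitors each.
z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, significant: {is_significant(p_value)}")
```

With these made-up numbers the p-value falls below 0.05, so the difference would count as statistically significant at the 95% level.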
Statistical significance is important because it gives you confidence that the changes you make to your website or app actually have a positive impact on your conversion rate and other metrics. Your metrics and numbers can fluctuate wildly from day to day, and statistical analysis provides a sound mathematical foundation for making business decisions.
There are two key variables that go into determining statistical significance: sample size and effect size.
Sample size refers to how large the sample for your experiment is. The larger your sample size, the more confident you can be in the result of the experiment (assuming that it is a randomized sample). If you are running tests on a website, the more traffic your site receives, the sooner you will have enough data to determine if there is a statistically significant result.
The second factor is effect size. If there is a small effect size (say a 0.1% increase in conversion rate) you will need a very large sample size to determine whether that difference is significant or just due to chance. However, if you observe a very large effect on your numbers, you will be able to validate it with a smaller sample size to a higher degree of confidence.
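The relationship between effect size and sample size can be sketched with the standard sample size approximation for comparing two proportions. This is a rough planning estimate, not an exact requirement. It assumes a fixed-horizon test, treats the lift as an absolute change in conversion rate, and uses 80% statistical power, a common convention that the text above does not specify. The function name and the 5% baseline rate are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_per_variation(baseline_rate, absolute_lift,
                              significance_level=0.95, power=0.80):
    """Approximate visitors needed per variation to detect a given lift.

    Assumptions (not from the article): two-sided test, equal traffic
    split, and 80% power by default.
    """
    p1 = baseline_rate
    p2 = baseline_rate + absolute_lift
    p_bar = (p1 + p2) / 2
    # Critical values of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - (1 - significance_level) / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / absolute_lift ** 2)

# Hypothetical 5% baseline: a tiny lift versus a large one.
small_effect = sample_size_per_variation(0.05, 0.001)  # +0.1 percentage points
large_effect = sample_size_per_variation(0.05, 0.01)   # +1 percentage point
print(small_effect, large_effect)
```

Because the required sample grows with the inverse square of the lift, making the effect ten times smaller increases the traffic you need by a factor of roughly one hundred.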
Beyond these two factors, a key thing to keep in mind is the importance of randomized sampling. If traffic to a website is split evenly between two pages but the sampling isn’t random, it can introduce errors due to differences in the behavior of the sampled populations.
For example, if 100 people visit a website and all the men are shown one version of a page and all the women are shown a different version, then a comparison between the two is not possible, even if the traffic is split 50-50, because the difference in demographics could introduce variations in the data. A truly random sample is needed to determine that the result of the experiment is statistically significant.
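One common way to keep assignment independent of who the visitor is can be sketched in a few lines of Python: hash a visitor ID and use the result to pick a version. This is an illustrative approach, not the method of any specific product. Hashing is pseudo-random rather than truly random, and the function name, experiment name, and visitor IDs below are all hypothetical.

```python
import hashlib

def assign_variation(visitor_id, experiment="homepage_test",
                     variations=("baseline", "variation")):
    """Assign a visitor to a version based only on a hash of their ID.

    The assignment ignores attributes such as gender, location, or device,
    and the same visitor always lands in the same version.
    """
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variations)
    return variations[bucket]

# Hypothetical visitor IDs: the split should come out close to 50-50.
counts = {"baseline": 0, "variation": 0}
for i in range(10_000):
    counts[assign_variation(f"visitor-{i}")] += 1
print(counts)
```

Including the experiment name in the hash means a visitor who lands in the baseline for one test is not automatically placed in the baseline for every other test.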
Calculating statistical significance accurately can be a complicated task that requires a solid understanding of statistics and calculus.
Fortunately, you can easily determine the statistical significance of experiments, without any math, using Stats Engine, the advanced statistical model built into Optimizely.
Stats Engine combines sequential testing and false discovery rate controls to deliver statistically sound results regardless of sample size. Updating in real time, Stats Engine holds every result to a 95% significance level, so you can be confident that you are making the right decision for your company and avoiding pitfalls along the way.
Stats Engine was created to let you test more in less time. By helping you make statistically sound decisions in real time, it adjusts values as needed and delivers trustworthy results quickly and accurately.
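To give a feel for what false discovery rate control means, here is a minimal Python sketch of the classic Benjamini-Hochberg procedure. It is a generic textbook technique shown for illustration and is not Stats Engine's actual implementation, which this article does not detail. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which results remain significant
    after controlling the false discovery rate at the given level."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most (k / m) * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    # Everything up to and including that rank is declared significant.
    significant = [False] * m
    for idx in order[:max_k]:
        significant[idx] = True
    return significant

# Hypothetical p-values from four metrics in one experiment.
p_values = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(p_values))
```

Checked one at a time against a 0.05 threshold, three of these four results would look significant. After the correction only the strongest one remains, which shows how this kind of control guards against false winners when many metrics or variations are tested at once.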
Start running your tests with Optimizely today and be confident in your decisions.