What is A/A testing?
A/A testing uses A/B testing to test two identical versions of a page against each other. Typically, this is done to check that the tool being used to run the experiment is statistically fair. In an A/A test, the tool should report no difference in conversions between the control and variation, if the test is implemented correctly.
Why test identical pages?
In some cases, you may want to monitor on-page conversions where you are running the A/A test to track the number of conversions and determine the baseline conversion rate before beginning an A/B or multivariate test.
In most other cases, the A/A test is a method of double-checking the effectiveness and accuracy of the A/B testing software. You should look to see if the software reports that there is a statistically significant (>95% statistical significance) difference between the control and variation. If the software reports that there is a statistically significant difference, that’s a problem. You will want to check that the software is correctly implemented on your website or mobile app.
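To make the significance check concrete, here is a minimal sketch of one common way to compare two conversion rates: a two-proportion z-test. This is an illustration under stated assumptions, not the method any particular A/B testing tool uses internally. The function name and the example counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_* are conversion counts and n_* are visitor counts (hypothetical inputs).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both arms share one true rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No conversions (or all conversions) in both arms: nothing to compare.
        return 0.0, 1.0
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical A/A result: 500 vs 520 conversions on 10,000 visitors each.
z, p = two_proportion_z_test(500, 10_000, 520, 10_000)
is_significant_at_95 = p < 0.05
```

With counts this close, the p-value stays well above 0.05, which is the inconclusive outcome you would hope to see from an A/A test.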
Things to keep in mind with A/A testing:
When running an A/A test, it’s important to note that finding a difference in conversion rate between identical test and control pages is always a possibility. This isn’t necessarily a poor reflection on the A/B testing platform, as there is always an element of randomness when it comes to testing.
When running any A/B test, keep in mind that the statistical significance of your results is a probability, not a certainty. At a statistical significance threshold of 95%, roughly 1 in 20 comparisons between truly identical pages will still appear significant purely because of random chance. In most cases, your A/A test should report that the conversion improvement between the control and variation is statistically inconclusive, because the underlying truth is that there isn’t one to find.
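The role of randomness can be illustrated with a small simulation. The sketch below runs many simulated A/A tests in which both arms share the same true conversion rate, then counts how often a simple two-proportion z-test would call the difference significant. The function names, the 10% conversion rate, and the visitor counts are all assumptions made for illustration, and the fixed-sample z-test stands in for whatever method your own tool uses.

```python
import math
import random


def p_value_two_sided(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_aa_false_positive_rate(true_rate, visitors_per_arm, runs,
                                    alpha=0.05, seed=0):
    """Share of simulated A/A tests that look significant by chance alone."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    false_positives = 0
    for _ in range(runs):
        # Both arms draw from the SAME true conversion rate.
        conv_a = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        conv_b = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        p = p_value_two_sided(conv_a, visitors_per_arm, conv_b, visitors_per_arm)
        if p < alpha:
            false_positives += 1
    return false_positives / runs


# Hypothetical setup: 10% true conversion rate, 1,000 visitors per arm, 500 tests.
rate = simulate_aa_false_positive_rate(0.10, 1_000, 500)
```

The returned share should land somewhere near `alpha` (0.05 here), even though nothing differs between the two arms.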
How does A/A testing impact conversion rates?
Because no actual change is made to the different versions in the experiment, it should have no impact on conversion rates. However, the whole goal of running an A/A test could be to validate experimentation software (if you’re not running it to determine a baseline CVR). So if your A/A test does end up showing a (significant) difference in conversion rates, this could indicate an issue with the software or setup. Make sure to check all the targeting rules and documentation to prevent any false positives.
Like any normal split test, an A/A test still has 2 versions. Unlike an A/B test, however, there is no B variant: the A version is duplicated so that both variants are identical.
Should you add an A variant to an A/B test, creating an A/A/B test?
This is a common question, as one way to validate an A/B test could be to add a duplicate of the A variant to the experiment. Let’s start by outlining what all these letters mean:
A/B test - 2 different versions in a single experiment, often referred to as a split test
A/A test - 2 of the same variant in a single experiment, used to validate setup or establish benchmark metrics
A/A/B test - Combining the above 2 tests into a single experiment, testing both the setup and a variant at the same time
A/B test results can be a highly impactful tool for increasing conversion rates (conversion rate optimization, or CRO). To use them accurately, you need a high level of confidence in the tool and the test setup. Adding a duplicate A variant to the test can help validate that the results you’re seeing for variant B are reliable and free of discrepancies. Great A/B testing tools have false positive prevention and functionality built in to detect issues in setup.
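One way to read an A/A/B test can be sketched in two steps: first compare the two identical A arms to check the setup, then compare A against B. The sketch below assumes a simple two-proportion z-test and pools the two A arms for the second comparison. Both choices are illustrative assumptions, and the function names and counts are hypothetical.

```python
import math


def p_value_two_sided(conv_x, n_x, conv_y, n_y):
    """Two-sided p-value of a two-proportion z-test."""
    pooled = (conv_x + conv_y) / (n_x + n_y)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_x + 1 / n_y))
    if se == 0:
        return 1.0
    z = (conv_y / n_y - conv_x / n_x) / se
    return math.erfc(abs(z) / math.sqrt(2))


def evaluate_aab(a1, a2, b, alpha=0.05):
    """Each arm is a (conversions, visitors) tuple."""
    # Step 1: the two identical A arms should NOT differ significantly.
    aa_p = p_value_two_sided(*a1, *a2)
    setup_ok = aa_p >= alpha
    # Step 2: pool both A arms (an assumption) and compare against B.
    pooled_a = (a1[0] + a2[0], a1[1] + a2[1])
    ab_p = p_value_two_sided(*pooled_a, *b)
    return {
        "setup_ok": setup_ok,
        "aa_p_value": aa_p,
        "ab_p_value": ab_p,
        # Only trust a B result when the A/A check passed.
        "b_significant": setup_ok and ab_p < alpha,
    }


# Hypothetical counts: the A arms agree, and B converts noticeably better.
result = evaluate_aab((500, 10_000), (510, 10_000), (600, 10_000))
```

If `setup_ok` comes back false, treat the B comparison as unreliable and look at the implementation before drawing any conclusion about the variant.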
Preventing false positives in A/B testing tools, and why it’s important
Running experiments can be great for optimizing conversion rates or impacting other business-critical metrics. But if you can’t rely on the software to accurately keep track of test results, this defeats the purpose of having testing software to begin with. The results need to be:
Trustworthy - Can you trust that the test results are accurate and reflect reality?
Accurate - Making sure sample sizes are large enough and results are stable is key.
Significant results - Are the results for variant B meaningfully and consistently different from the A variant?
A/B testing and experimentation software, which allows you to run more than just A/B tests, is meant to give marketers trust in their test results. Running an A/A test tackles the first 2 of the aforementioned points, so you know that the third, significant results, is accurate and can be trusted.
How A/A test data can help your analytics tool and vice versa
Using an A/A test is a great way to measure your analytics setup. Running the same variant twice in the same experiment gives you a benchmark KPI to track against. The test data should show the average conversion rate you need to beat.
How does your analytics tool play into that? Your analytics tool, likely Google Analytics, should already be tracking your conversion rates. So if you’re running an A/A test to measure benchmark metrics, shouldn’t those be (nearly) the same? Correct!
A/A testing is a common practice for validating a tool against itself, but also against other vendors’ tools. If you already know your Google Analytics conversion rates are being tracked accurately, your A/A test should show (nearly) the same rates.
Help! My A/B test tools and analytics tools are showing different conversion rates after an A/A test
Make sure you run some common troubleshooting steps:
Check the sample size of your test. Although this test should rarely reach statistical significance, because there is no real difference between the 2 variants to measure, it’s still important to run the test on a sizable number of visitors to validate its accuracy.
Check the targeting rules for both tools. Most experimentation tools have to run at the top of the page head or server-side, while your analytics tool might run through something like Google Tag Manager, so the rules that determine which pages each tool fires on could differ. Make sure to test and check setups and coverage across both.
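The troubleshooting steps above can be supported with a quick side-by-side comparison of the numbers each tool reports. The sketch below is only an illustration: the function name, the dictionary keys, and the 5% tolerance are hypothetical choices, and you would export the counts from your own tools by hand or through their reporting features.

```python
def compare_tools(experiment_tool, analytics_tool, tolerance=0.05):
    """Flag metrics whose relative difference between tools exceeds tolerance.

    Each argument is a dict with 'visitors' and 'conversions' counts
    (hypothetical structure). The analytics tool is used as the reference.
    """
    def with_rate(counts):
        return {
            "visitors": counts["visitors"],
            "conversions": counts["conversions"],
            "conversion_rate": counts["conversions"] / counts["visitors"],
        }

    tool = with_rate(experiment_tool)
    reference = with_rate(analytics_tool)

    differences = {}
    for metric, ref_value in reference.items():
        if ref_value == 0:
            raise ValueError(f"reference value for {metric} is zero")
        differences[metric] = abs(tool[metric] - ref_value) / ref_value

    flagged = [m for m, diff in differences.items() if diff > tolerance]
    return differences, flagged


# Hypothetical counts for the same page and date range.
differences, flagged = compare_tools(
    {"visitors": 9_000, "conversions": 480},
    {"visitors": 10_000, "conversions": 500},
)
```

A large gap in visitor counts, as in this made-up example, points toward differing targeting rules or page coverage, while a gap only in conversions suggests looking at how each tool defines and fires the conversion event.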
Good minimum sample sizes for A/A tests
Although large sample sizes are not always needed for A/A tests, because you’re not actually changing anything in the variants, running a test with a large sample size can be easy. For instance, running an A/A test on the homepage is a great idea, as for many websites this is among the most visited pages and could quickly help identify any issues with your setup. Using a less important landing page is also an option, but always take external factors into account. If traffic on this page fluctuates a lot, for instance because of paid budgets, it might not be the best page to run the test on. You’re looking for a page with stable conversion rates to benchmark against.
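To get a feel for how sample size affects a baseline conversion rate, the sketch below uses the standard normal-approximation margin of error for a proportion. It is a rough planning aid under stated assumptions (a 95% confidence level via z = 1.96, and independent visitors), not a rule from any specific tool. The function names and example numbers are hypothetical.

```python
import math


def baseline_margin_of_error(conversions, visitors, z=1.96):
    """Approximate margin of error (as a proportion) around a measured rate."""
    rate = conversions / visitors
    return z * math.sqrt(rate * (1 - rate) / visitors)


def visitors_needed(expected_rate, margin, z=1.96):
    """Visitors needed so the baseline rate is known to within +/- margin."""
    return math.ceil(z ** 2 * expected_rate * (1 - expected_rate) / margin ** 2)


# Hypothetical baseline: 500 conversions from 10,000 visitors (5%).
moe = baseline_margin_of_error(500, 10_000)

# Visitors needed to pin a roughly 5% rate down to +/- 0.5 percentage points.
needed = visitors_needed(0.05, 0.005)
```

The margin shrinks with the square root of the visitor count, so halving the uncertainty takes roughly four times the traffic, which is one reason a high-traffic page is convenient for an A/A test.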
Optimizely Experiment stats engine and A/A testing:
When running an A/A test with Web or Feature Experimentation, in most cases, you can expect the results from the test to be inconclusive, meaning the conversion difference between identical variations will not reach statistical significance. In fact, the proportion of A/A tests showing inconclusive results will be at least as high as the significance threshold set in your Project Settings (90% by default).
In some cases, however, you might see that one variation is outperforming another or a winner is declared for one of your goals. Such a conclusive result occurs purely by chance and should happen in no more than about 10% of cases if you have set your significance threshold to 90%. If your significance threshold is higher (say 95%), your chances of encountering a conclusive A/A test are even lower (5%).
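The arithmetic behind those percentages is simple enough to sketch. The snippet below only restates the relationship described above (the share of conclusive A/A results is bounded by 1 minus the significance threshold). It does not model how Stats Engine calculates significance, and the function names are hypothetical.

```python
def max_conclusive_share(significance_threshold):
    """Upper bound on the share of A/A tests expected to look conclusive."""
    return 1.0 - significance_threshold


def expected_conclusive_tests(num_aa_tests, significance_threshold):
    """Rough upper bound on how many of num_aa_tests end up conclusive."""
    return num_aa_tests * max_conclusive_share(significance_threshold)


# With the default 90% threshold, up to about 10% of A/A tests may be conclusive.
share_at_90 = max_conclusive_share(0.90)

# With a 95% threshold, about 1 in 20 A/A tests at most.
count_at_95 = expected_conclusive_tests(20, 0.95)
```

So if you run 20 A/A tests at a 95% threshold and one of them declares a winner, that alone is not evidence of a broken setup.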