Before we get ahead of ourselves, let’s establish a shared definition. In statistics, an interaction is defined as “…a situation in which the simultaneous influence of two variables on a third is not additive.”¹
In the context of digital experimentation, we can see interaction effects when multiple experiments target the same users. The combination of experiment variations may alter their behaviors in surprising ways that we couldn’t anticipate just from looking at results for each experiment independently.
Consider an example of two A/B tests on a single Shop Now button. One test modifies button text, and the other modifies button color. You configure these tests to be mutually exclusive, which means that users who participate in one test cannot be targeted by the other.
After both tests have reached statistical significance, you find that users who saw the “Discover” text variation clicked the button 2% more frequently, and users who saw the red color variation clicked the button 5% more frequently. You might assume that if you deploy the winning variations from both experiments, you will see an overall improvement of 7%. Because you are a careful experimenter, you decide to validate your assumption by testing the winning combination directly.
To your surprise, you find that visitors who are exposed to the combination of winning variations click the button 10% more frequently. Where is the “extra” 3% lift coming from? The answer is that you are seeing an interaction effect: the combination of text and color performs better than the sum of the two individual lifts would predict.
Congratulations, you’ve just discovered a positive interaction effect!
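To make the arithmetic concrete, here is a minimal Python sketch of the comparison described above. It follows the example in treating the two lifts as simply additive, and it only compares point estimates, so it says nothing about whether the difference is statistically significant. The function name `interaction_effect` and the variable names are illustrative assumptions, not part of any experimentation tool.

```python
def interaction_effect(lift_a: float, lift_b: float, lift_combined: float) -> float:
    """Return the observed combined lift minus the additive expectation.

    Lifts are expressed as fractions (0.02 means a 2% lift).
    A positive result suggests a positive interaction, a negative
    result a negative interaction, and roughly zero suggests the
    two changes behave additively.
    """
    expected_additive = lift_a + lift_b
    return lift_combined - expected_additive


# Figures from the Shop Now button example (hypothetical variable names).
text_lift = 0.02      # "Discover" text variation
color_lift = 0.05     # red color variation
combined_lift = 0.10  # both winning variations shown together

expected = text_lift + color_lift
interaction = interaction_effect(text_lift, color_lift, combined_lift)

print(f"Expected additive lift: {expected:.0%}")
print(f"Observed combined lift: {combined_lift:.0%}")
print(f"Interaction effect:     {interaction:+.0%}")
```

If the combined lift had come in below 7%, the same calculation would return a negative number, which would point to a negative interaction effect instead.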