July 16, 2015

How The Next Web Scaled Their A/B Testing Output 100x


Much like these trees, we have achieved huge scale in our testing program over the past 3 years.

This is a guest blog post from Martijn Scheijbeler, the SEO, analytics and growth lead at The Next Web.

About two years ago we started intensifying our A/B testing program here at The Next Web, one of the world’s largest online publications for news about Internet technology, business, and culture. We’ve found A/B testing to be a very powerful way to increase reader engagement, our primary website goal. We see engagement as the first step in creating a relationship with readers which, in the long run, increases the likelihood they will purchase our courses, products, and event tickets. Over the past years, we have learned quite a lot about how to engage more readers while also improving the process and setup of our testing program.

In this blog post I’d like to dive into some aspects of the testing program at The Next Web to provide you with some ideas to incorporate into your own testing program. Over the years, I have talked to a number of companies — enterprises and SMBs in media and non-media industries — about their testing programs. In the end, we all seem to agree that focusing conversion rate optimization efforts on experiments to increase engagement is an effective way to increase revenue.

The decision to invest more in testing

I joined The Next Web 2.5 years ago as an SEO specialist, but after a few months, I took on the responsibility for setting up our web analytics and testing program.

When we first started with A/B testing, our setup was simple: Optimizely was our main testing tool, where we both set up our tests and analyzed the results. We ran 1-2 tests per month that focused on three things:

1) improving user engagement

2) improving click-through rates on calls to action (CTAs) for specific projects like TNW Academy

3) increasing time spent on the site, resulting in more ad impressions

I ‘designed’, coded and analysed every test which, in hindsight, was not the best position for me to be in. We knew we had to scale our testing program, so we decided to hire a web analyst who would be fully responsible for running tests.

Optimization tools used before and after scaling:

Before	After
Optimizely	Optimizely, Google Tag Manager, Google Analytics, in-house heat map tool/click tracking

After scaling our efforts, we now have multiple tools in our optimization stack which help us test across multiple projects/code bases, collect data and analyze results. By implementing these tools, we were able to increase the number of tests we run from 1-2 per month to 5-6 per week, which means we can run about 100x more tests with these improvements. But how exactly did we make these advancements to our testing program?

Ideas for testing and our backlog

Since we still see ourselves as a startup, our team is pretty small and everyone on our team is involved in our testing.


A screenshot of the custom internal tool we built to help organize our A/B test ideas.

On a weekly basis, our CEO asks me to make tests even bolder because he’s curious what will happen. This makes my job almost too easy ;-). Together with our CEO, our web analyst and I discuss hypotheses, such as how we can make our related stories even more visible on the page to increase the time users spend on The Next Web.

Our community team is directly involved in regularly testing the sharing buttons on articles. Frequent experiments include different CTAs, button orders, etc.

We currently have a backlog of over 50 ideas contributed by many different parts of the team. Since we move fast these days, it’s important to prioritize ideas to ensure that we’re working on the tests with the highest possible impact, so we reach our targets faster.

Tip: Always make sure to record the test ideas you and the rest of your team generate.
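As an illustration only, recording and prioritizing a backlog like this could be sketched in Python. The `Idea` class, the 1-5 scales, the impact-divided-by-effort score and the sample entries are all assumptions made up for this example; they do not describe the internal tool shown in the screenshot.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    """One backlog entry. The fields and 1-5 scales are hypothetical."""
    title: str
    submitted_by: str
    expected_impact: int  # 1 (low) to 5 (high)
    effort: int           # 1 (easy) to 5 (hard)

    @property
    def priority(self) -> float:
        # Assumed scoring rule: favor high impact and low effort.
        return self.expected_impact / self.effort


def prioritize(backlog):
    """Return the ideas sorted from highest to lowest priority score."""
    return sorted(backlog, key=lambda idea: idea.priority, reverse=True)


# Made-up sample entries, loosely based on the kinds of tests mentioned above.
backlog = [
    Idea("Make related stories more visible", "management", 5, 2),
    Idea("Reorder share buttons", "community team", 3, 1),
    Idea("New CTA copy for TNW Academy", "web analyst", 4, 4),
]

ranked = prioritize(backlog)
for idea in ranked:
    print(f"{idea.priority:.2f}  {idea.title}  (from {idea.submitted_by})")
```

A simple ratio like this is easy to argue about in a weekly meeting, but any scoring rule the whole team understands would serve the same purpose.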

Testing across multiple projects/code bases

We’re very lucky to have a development team who supports our testing program. Currently, we run our projects on nine different platforms/codebases. Optimizely runs on many of those projects, ranging from TNW Deals to TNW Academy. Tests on our main blog run on our in-house solution so that we can fully benefit from its number of monthly unique visitors. For reference, The Next Web receives about 7.5 million monthly visits and 11 million monthly page views. Combining our in-house solution with Optimizely enables us to test across all projects and push all the tests live very easily.

Because we test so many properties and codebases, we finally acknowledged that we couldn’t keep all the codebases up to date by hand and decided to start using Google Tag Manager with Optimizely and custom JavaScript.

Tip: Make sure that your development team supports your testing program so you can easily set up new tests and deploy updated versions to your live servers.

Collecting user data and metrics

Our main web analytics tool is Google Analytics. We use it to collect all of our data and set goals so we can monitor revenue-related KPIs and ticket sales for our events. We also use Google Analytics to set various goals for engagement, such as clicks on certain areas of the site like ‘related stories’ and share buttons. These are the most important objectives for most publishers, since that’s how we drive engagement and increase viewable ad impressions.

Heat map

At our size (15-20 million click events per month), most existing tools make it very expensive to collect click data on templates. That’s why we decided to use an in-house solution. A few months ago we finally set up a tool that could analyze click behavior in different experiments and variants. You can read more about the setup here. The collected data is available on request from our Business Intelligence servers, making it easy for us to segment it by URL, device and browser version.

We’re using heat maps to make sure we also see interactions with items on our platform that are not clickable but still attract users’ interest. Heat maps drive new ideas for testing but also provide insights into what our users see on the page.
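To make the segmentation by URL, device and browser more concrete, here is a minimal Python sketch of counting click events per variant and page element. The event fields (`variant`, `element`, `url`, `device`, `browser`), the function name and the sample events are assumptions for illustration; the actual schema of the in-house tool is not described in this post.

```python
from collections import Counter


def clicks_by_variant(events, **filters):
    """Count clicks per (variant, element) for events matching all filters.

    `filters` are exact-match conditions on event fields, for example
    url="/story-a" or device="mobile". The field names are hypothetical.
    """
    counts = Counter()
    for event in events:
        if all(event.get(field) == value for field, value in filters.items()):
            counts[(event["variant"], event["element"])] += 1
    return counts


# Made-up click events for two variants of one experiment.
events = [
    {"variant": "control", "element": "related-stories",
     "url": "/story-a", "device": "mobile", "browser": "Chrome 43"},
    {"variant": "control", "element": "share-button",
     "url": "/story-a", "device": "desktop", "browser": "Firefox 38"},
    {"variant": "bold-related", "element": "related-stories",
     "url": "/story-a", "device": "mobile", "browser": "Chrome 43"},
    {"variant": "bold-related", "element": "related-stories",
     "url": "/story-a", "device": "mobile", "browser": "Safari 8"},
    {"variant": "bold-related", "element": "headline-image",
     "url": "/story-b", "device": "mobile", "browser": "Chrome 43"},
]

mobile_counts = clicks_by_variant(events, url="/story-a", device="mobile")
for (variant, element), clicks in sorted(mobile_counts.items()):
    print(variant, element, clicks)
```

Note that the element does not have to be a link: recording clicks on something like a headline image is what lets a heat map reveal interest in non-clickable items.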

Automating

Ultimately we found that analyzing our test results takes the most time. It’s easy to find which variant performs better than another, but we also want to look at different segments of our users (for example, new vs. returning) and at how certain tests influence these different users. By doing this we’re able to see which tests we want to launch permanently, and for which users. I gave some more insights into why we set up our own platform on Optiverse.
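A segmented comparison of this kind can be sketched in a few lines of Python. This is a hedged illustration, not the platform described above: the function names, the input layout and the numbers are invented, and the significance check is a standard pooled two-proportion z-test chosen for the example.

```python
from math import erf, sqrt


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    if n_a == 0 or n_b == 0:
        return 1.0
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return 2 * (1 - cdf)


def compare_by_segment(results):
    """results maps segment -> {"control": (conversions, visitors), "variant": (...)}."""
    report = {}
    for segment, arms in results.items():
        c_conv, c_n = arms["control"]
        v_conv, v_n = arms["variant"]
        c_rate = c_conv / c_n if c_n else 0.0
        v_rate = v_conv / v_n if v_n else 0.0
        report[segment] = {
            "control_rate": c_rate,
            "variant_rate": v_rate,
            "relative_lift": (v_rate - c_rate) / c_rate if c_rate else float("nan"),
            "p_value": two_proportion_p_value(c_conv, c_n, v_conv, v_n),
        }
    return report


# Made-up numbers: the variant helps new visitors but barely moves returning ones.
results = {
    "new": {"control": (400, 10000), "variant": (480, 10000)},
    "returning": {"control": (900, 10000), "variant": (905, 10000)},
}

report = compare_by_segment(results)
for segment, row in report.items():
    print(segment, f"lift={row['relative_lift']:+.1%}", f"p={row['p_value']:.3f}")
```

With data shaped like this, a test could be launched only for the segment where it clearly works. The more segments you check, the more likely one of them looks significant by chance, so treat per-segment p-values with some caution.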

In conclusion

By backlogging our ideas, gaining support from our development team and automating the collection of data, we were able to run more tests and increase our efficiency dramatically!

I’m impressed with all the improvements we have already made in our testing program so far, and I’m looking forward to the changes we’ll make in the upcoming year as we scale our testing efforts even more. For now I’d love to hear about the improvements you have made in the setup of your own testing programs, so leave them in the comments below!
