The value of experimentation for product teams


Hi, I'm Griffin Cox, Senior Technical Product Manager at Optimizely. I manage our Feature Experimentation product, which is our full-stack experimentation and feature management solution. I used to be an engineer and now I'm a product manager, so I've seen both sides of the world of product development, with feature flags and products like Optimizely and without. And let me tell you, it's a much brighter world with products like Feature Experimentation.

On the development side of things, I've built features that came from product. We only had one version of it. We built that version, we deployed that version, and then the next day they came to us asking us to modify it. The first time a product manager or product team comes to an engineer and says, "Hey, can you change this text to this text, or this number to this number?", it's okay. But the fifth time they want to play around with it, it starts to get annoying.

Enable product teams

And so with a product like Optimizely, particularly with our feature flagging features, you're able to make those kinds of changes without needing to even speak to a developer. So the developers really can focus on the harder problems, the problems they really want to solve, not just changing text and numbers. It also really enables a new kind of collaboration with my product team. Rather than them coming to me and asking me to build a single feature, they would ask me to build a prototype of a feature, essentially a template for a feature. So if we're talking about a slider that maybe has a headline, a minimum, a maximum, some buttons underneath, and a call to action, we would work together to identify parameters or variables that really could power the feature and enable my product team to play with those values without needing to loop me in.

It also meant that if something went wrong off hours, let's say at midnight, I wouldn't need to write a pull request, get one of my colleagues online to approve it, go through our entire software development life cycle, get it out, and essentially have a bug in production for an hour or more. It meant that I could just flip a switch and turn that feature off. It also meant that my product team could do that, and I wouldn't even need to be woken up at all. This sort of collaboration enabled my product team to experiment on their own.

Here at Optimizely, we recommend a modern development workflow that really begins with the product team. The product team basically spells out what parts of the feature they would ever want to experiment on and what metrics they'd want to collect down the road. Basically, the more the better.

From there, they're sort of creating a flag spec, or specification, that can be handed off in story form to an engineer. That engineer takes that story, creates a feature flag in the Optimizely UI, and adds variables to that feature flag, each one representing an aspect of the feature that the product development team or whoever else may want to experiment on. So for example, with a slider, let's say the slider has a headline, a minimum, a maximum, and a call to action. Those are kind of like the dimensions of the feature that the product team may someday want to experiment on. And maybe they want to measure how many people click the call to action, revenue downstream, and a few other metrics. All of these go into a story that's handed off to a developer, and that developer then builds the feature flag.
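To make the flag spec idea a bit more concrete, here is a rough sketch of what such a spec could look like as plain data. Everything in it is an assumption for illustration: `FlagSpec`, `resolve`, the `price_slider` key, and the default values are hypothetical names, not part of the Optimizely SDK or UI.

```python
from dataclasses import dataclass, field


@dataclass
class FlagSpec:
    """Hypothetical flag spec: a flag key, its variables with defaults, and metrics."""
    key: str
    variables: dict = field(default_factory=dict)
    metrics: list = field(default_factory=list)

    def resolve(self, overrides=None):
        """Return variable values, applying overrides only for known variables."""
        overrides = overrides or {}
        unknown = set(overrides) - set(self.variables)
        if unknown:
            raise KeyError(f"unknown variables: {sorted(unknown)}")
        # Defaults first, then the values the product team chose to change.
        return {**self.variables, **overrides}


# The slider from the example: each variable is one dimension to experiment on.
slider_flag = FlagSpec(
    key="price_slider",  # hypothetical flag key
    variables={
        "headline": "Choose your budget",
        "minimum": 0,
        "maximum": 100,
        "call_to_action": "Get started",
    },
    metrics=["cta_click", "revenue"],
)

# A variation only changes values; the code that renders the slider stays the same.
variation_b = slider_flag.resolve({"headline": "Set your price", "maximum": 500})
```

The point of the sketch is that the developer agrees on the variable names once, and after that a new variation is just a different set of values.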

Benefits of Feature Flags

They implement it in the code, and they test the flag as it is, but not as an experiment. We're not at the experiments yet. So with the flag in place, and when the flag's boundaries are tested, that's really the end of the developer's responsibility. The product team can then go in and add an experiment, add a targeted delivery, roll out that feature to a specific audience or a fine-grained percentage of their audience, or run as many experiments as they'd like using the feature flag with those feature variables.

Developers naturally take to feature flags because it's a concept that they may be using in their day-to-day anyway. The most basic version of a feature flag is essentially a constant in the code. It can be set to enabled or disabled, for example, and changing that value could enable or disable a whole feature. The problem is when you start to release those features in production. To turn them on or off, you'd need to go through the entire software development life cycle and release a whole new pull request to make that change happen.
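Here is a minimal sketch of that difference, a constant in the code versus a flag whose value lives outside the code. `FlagStore` is a hypothetical in-memory stand-in for a remotely managed flag service; it is not the Optimizely SDK, and the `new_slider` key is made up for the example.

```python
# The most basic feature flag: a constant. Flipping it means a new release.
NEW_SLIDER_ENABLED = False


class FlagStore:
    """Hypothetical in-memory stand-in for a remotely managed flag service."""

    def __init__(self):
        self._flags = {}

    def set(self, key, enabled):
        # In a real setup this change would come from a UI, not from code.
        self._flags[key] = enabled

    def is_enabled(self, key, default=False):
        # Unknown flags fall back to the default, so the feature stays off.
        return self._flags.get(key, default)


def render_slider(store):
    """Pick the code path at run time based on the current flag value."""
    if store.is_enabled("new_slider"):
        return "new slider"
    return "old slider"


store = FlagStore()
before = render_slider(store)      # flag not set yet, old path

store.set("new_slider", True)      # product team turns the feature on
after = render_slider(store)

store.set("new_slider", False)     # something went wrong: flip it back off
rolled_back = render_slider(store)
```

Nothing was redeployed between `before`, `after`, and `rolled_back`; only the flag value changed.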

And that can take a lot of time. It's also prone to errors, which can slow down your median time to recovery. So with a feature flag such as an Optimizely feature flag, you're able to essentially remote control your code, unlocking at least three really exciting capabilities for developers. The first one is the ability to really de-risk your deployments. So if you're not really sure that that feature's going to work, or even if you are, you can release that feature to production just for selected audiences, or just to, say, 1% of your traffic, to really limit the blast radius of that feature. If there is a bug, and let's face it, there's usually a bug, it only reaches that small slice of traffic.

The biggest value add here is making alignment between development and product teams less necessary. By putting a feature flag in place, developers can deploy when their testing standards are met, when they're ready, when their software development life cycle is complete, and a product team can then turn it on on their schedule, which could be the next day, the next week, or even the next month.
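The "1% of your traffic" idea can be sketched with deterministic bucketing: hash each user into a stable bucket and turn the feature on only for buckets below the rollout threshold. This is a generic illustration under my own assumptions, not Optimizely's actual bucketing algorithm; `bucket`, `in_rollout`, and the `new_slider` key are hypothetical names.

```python
import hashlib


def bucket(user_id: str, flag_key: str, buckets: int = 10_000) -> int:
    """Map a user to a stable bucket in [0, buckets) from a hash of flag key and user ID."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def in_rollout(user_id: str, flag_key: str, percentage: float) -> bool:
    """True if the user falls inside the rollout percentage (0 to 100)."""
    # With 10,000 buckets, each 0.01% of traffic corresponds to one bucket.
    return bucket(user_id, flag_key) < percentage * 100


users = [f"user-{i}" for i in range(10_000)]

# Roughly 1% of users see the feature; the rest stay on the old path.
exposed_1 = {u for u in users if in_rollout(u, "new_slider", 1.0)}

# Raising the percentage later only adds users; nobody already exposed is dropped.
exposed_5 = {u for u in users if in_rollout(u, "new_slider", 5.0)}
```

Because the bucket depends only on the flag key and the user ID, the same user gets the same answer on every request, and widening the rollout keeps the original 1% inside the new audience.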

By decoupling the two processes, deployment and release, you're able to reduce the alignment necessary between developers and product teams, really speeding up your developer velocity.

As a product manager, it really starts by admitting that I don't know what I don't know. But one thing I do know is that customers know best. What may work for some customers of mine may not work for others, and what features and configurations may work for, let's say, a competitor may not work at my company. And so I test everything. Using an Optimizely feature flag in Feature Experimentation, I'm able to test anything, front end to back end, algorithms to button colors. Everything gets tested and every result gets quantified. That way I can make data-informed decisions about my business.

Optimizely's Feature Experimentation helps me avoid common pitfalls in the product development process. First, it helps me avoid not measuring impact, which unfortunately is too common in our industry. There's this emphasis on delivery, delivery, delivery, and not on delivery, measurement, thinking, delivery, measurement, thinking. And with this product, I'm able to quantify everything that my team does to help me inform the next steps and indeed the future of the product.

Not only does Feature Experimentation help me avoid not measuring impact, but it also helps me avoid wasting effort. It still blows my mind to this day that in the field of tech we often release one version of a feature to our entire audience, sometimes all at once, without very much testing in production. I've already kind of talked about the benefits that feature flagging and Optimizely Feature Experimentation bring to that process. But I want to talk to you now about how Feature Experimentation can help you extract more value from each feature. It starts with the insight that features perform differently for different audiences.

Personalize features

With Optimizely Feature Experimentation, you can unlock what we call Feature Personalization. Now, that goes beyond your standard personalization, which is usually just on content and images. When we say Feature Personalization, we ultimately mean customizing, tailoring your features to the audiences they perform for. For example, let's say you're testing two designs, and let's say you have two audiences, a younger audience and an older audience. In traditional A/B testing, you would test variation A versus variation B for both audiences combined. Let's say variation B was the winner. You would then roll it out for overall traffic, and you'd be done. Now that's better than nothing, but wouldn't it be better if you could test them separately and roll out potentially different winning variants to each audience, essentially giving customers exactly what they want?

So in that example, what it would look like is running an experiment on the feature flag for your younger audience, and then running an experiment for your older audience, let's say both with the same variants. And let's say the younger audience preferred version A, and the older audience preferred version B. Being able to roll out version A to that younger audience and version B to that older audience at the same time, that's what we call Feature Personalization.

When you're ready to move beyond website UI testing, perhaps towards mobile app testing or testing anywhere in your tech stack at high performance, then you're ready for Optimizely Feature Experimentation. Check it out at