Feature flags (aka feature toggles) are quickly becoming standard practice in strong engineering organizations. They let teams deploy and modify features without changing code, and make it easy to run A/B experiments on those features in production to ensure customers get the best user experience.
However, as more engineering organizations scale their use of feature flags, it’s important to recognize that not all feature flags serve the same purpose, and that treating them all the same way is risky.
In tandem with determining the different types of feature flags, it’s important to address the governance that needs to be in place to manage feature flags at scale. Governance can be defined as the establishment of policies around the how/what/why/when of your feature flag implementation. Without standardizing these policies across your organization, proper implementation and monitoring become impossible, which can jeopardize your feature flag program and lead to potentially catastrophic consequences.
When approaching governance around feature flags, I’ve seen three key areas that need consideration: limiting feature flag lifetimes to minimize technical debt, deciding who on your team should be able to modify feature flags, and establishing a crisp ownership model for how feature flags are deployed.
Limit Feature Flag Lifetimes to Minimize Technical Debt
As you scale your use of feature flags, you may need some long-term flags that stay in your system indefinitely. In my experience, however, never removing any of your flags is dangerous. Persistent feature flags that gate permanent features introduce technical debt. With too many “permanent” feature flags, your code base becomes:
- harder to understand: what’s under the hood of this feature?
- harder to test: which code path is actually being executed?
- more fragile: what if another piece of work disables your flag?
- harder to support: what happens when institutional knowledge is lost?
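One way to keep flag lifetimes visible is to record an expected removal date next to each flag and check it regularly. The following is a minimal sketch under that assumption; the `FlagRecord` fields, flag keys, and team names are hypothetical and do not reflect any particular flag service.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class FlagRecord:
    """Hypothetical metadata kept alongside each flag."""
    key: str
    owner: str
    created: date
    expires: Optional[date]  # None marks a deliberately long-lived flag


def find_expired_flags(flags: List[FlagRecord], today: date) -> List[str]:
    """Return the keys of flags whose planned removal date has passed."""
    return [f.key for f in flags if f.expires is not None and f.expires < today]


flags = [
    FlagRecord("new_checkout", "payments-team", date(2024, 1, 10), date(2024, 3, 1)),
    FlagRecord("search_kill_switch", "sre", date(2023, 6, 1), None),
]

expired = find_expired_flags(flags, today=date(2024, 4, 1))
print(expired)  # prints ['new_checkout']
```

A check like this could run in CI or on a schedule, so that overdue flags surface as cleanup work instead of lingering unnoticed.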
Defining Access Controls for Flags
Feature flags give you the capability to test a new feature quickly in production and, if issues arise, to roll back just as quickly. If a developer rolls out a feature on Friday night and that flag causes a customer-facing issue on Sunday morning, the on-call engineer can simply flip it off without paging an entire team to debate rolling the build back or fixing forward. But not all developers should be free to roll back any feature flag. What about a flag that gates a feature that’s critical to a customer? Turning off that flag could be very damaging to the business.
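As an illustration of this idea, access can be modeled as a per-flag list of roles that are allowed to change it. This is only a sketch: the flag keys, role names, and the `can_modify` helper are hypothetical, and a real system would load the policy from its flag service or identity provider.

```python
from typing import Dict, Set

# Hypothetical policy: flag key -> roles allowed to change that flag.
FLAG_EDITORS: Dict[str, Set[str]] = {
    "new_dashboard": {"developer", "on_call"},   # safe for on-call to flip off
    "enterprise_sso": {"monetization"},          # customer-critical, restricted
}


def can_modify(flag_key: str, role: str) -> bool:
    """Deny by default: unknown flags and unlisted roles get no access."""
    return role in FLAG_EDITORS.get(flag_key, set())


print(can_modify("new_dashboard", "on_call"))   # prints True
print(can_modify("enterprise_sso", "on_call"))  # prints False
```

Denying by default means a flag that nobody has classified yet cannot be changed by accident.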
Define, Document and Communicate Ownership Model
For feature flags to function well, it must be absolutely clear who implements, rolls out, and maintains each flag.
Here are a couple of common models:
- In the most common cases, the development team owns implementation and maintenance; the product team owns communication and controls of the rollout.
- When feature flags act as circuit breakers, DevOps or SRE may own both maintenance and communication instead.
Develop a clear ownership model for each type of flag — and document it in an accessible place to avoid inefficiency and confusion.
Understanding Feature Flag Types
So, what are some types of feature flags? Martin Fowler’s great blog post, “Feature Toggles (aka Feature Flags),” breaks down four core styles of feature flags (Release Toggle, Ops Toggle, Permission Toggle, and Experiment Toggle). Asa Schachar’s book, Ship Confidently with Progressive Delivery and Experimentation, goes deeper into specific types (permission, operational, configuration-based, product re-brands, large-scale refactors, etc.) and articulates long-term vs. short-term governance strategies.
These are helpful resources; however, your organization may not need to use every type of flag.
At Optimizely, we leveraged these resources when evaluating our organizational needs for feature flags, and we grouped the flags we use into six types. We arrived at these six distinct flag types based on the idea that each requires its own governance process.
Feature rollouts are flags that gate new, customer-facing features. The flagged code is meant to be permanent, so we expect to remove the flag once the feature is fully rolled out in our production environment. Of all our flag types, feature rollouts and bug fixes have the lowest level of access control; all developers have visibility and control over them.
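To make the idea of a gradual rollout concrete, here is a minimal sketch of a percentage-based gate. It assumes users are bucketed by hashing the flag key together with a user ID; the `in_rollout` function and its parameters are illustrative, not how Optimizely implements rollouts.

```python
import hashlib


def in_rollout(flag_key: str, user_id: str, percentage: int) -> bool:
    """Return True if this user falls inside the rollout percentage (0-100).

    Hashing flag key + user ID gives each user a stable bucket from 0 to 99,
    so a user who is in at 10% stays in as the rollout grows to 50% or 100%.
    """
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percentage


# Usage: gate the new code path, keeping the old path as the fallback.
if in_rollout("new_checkout", "user-42", percentage=25):
    page = "new checkout"
else:
    page = "old checkout"
```

Once the percentage reaches 100 and the feature is stable, both the check and the old code path can be deleted, which is the flag removal step described above.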
Bug fix flags gate more complicated code changes that we want to roll out slowly. Developers also use bug fix flags to try quicker fixes and test them in production behind the flag. Bug fix flags are always meant to be removed once the fix is confirmed to be good. These have the highest visibility, since we need to be sure that we are correctly fixing our incidents and not making them worse. All developers have visibility and control over these.
Operational and Circuit Breaker
Operational flags control any operational aspect of our system’s behavior. The circuit breaker flag is a manual shut-off switch that disables some functionality. Both are long-term flags, and we expect them to be in our code base for a while. They generally have a higher level of access control, as they control behavior of our system that may be sensitive. The functionality of these flags is owned by their respective teams and requires clear documentation of how the flags affect the systems in which they are implemented.
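A circuit breaker flag of this kind can be sketched as a guard around an optional piece of functionality. The example below is an assumption-laden illustration: the in-memory `FLAGS` dictionary, the flag name, and the `get_recommendations` function are hypothetical stand-ins for a real flag service and a real feature.

```python
from typing import Dict, List

# Hypothetical in-memory flag store; a real system would query its flag service.
FLAGS: Dict[str, bool] = {"recommendations_enabled": True}


def get_recommendations(user_id: str) -> List[str]:
    """Return recommendations, or an empty list when the breaker is off."""
    # Treat a missing flag as "off" so the feature fails closed.
    if not FLAGS.get("recommendations_enabled", False):
        return []  # degrade gracefully instead of calling the backing service
    return [f"item-for-{user_id}"]  # placeholder for the real lookup


print(get_recommendations("u1"))           # prints ['item-for-u1']
FLAGS["recommendations_enabled"] = False   # operator flips the manual shut-off
print(get_recommendations("u1"))           # prints []
```

Because the flag is meant to stay in the code for a long time, the disabled path deserves the same testing and documentation as the enabled one.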
Permission flags determine which sets of customers have access to certain features. These gate our feature tiers, as determined by the pricing and packaging for Optimizely. Permission flags have the strictest access control: only members of our monetization team are allowed to change them. At Optimizely, these controls are implemented by our development team, but the actual control flow is managed by a business team.
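In code, a permission flag often reduces to a lookup from a customer's plan to the features that plan includes. The sketch below assumes that shape; the plan names, feature names, and `has_feature` helper are hypothetical and are not Optimizely's actual packaging.

```python
from typing import Dict, Set

# Hypothetical packaging table, owned by the business team in this sketch.
PLAN_FEATURES: Dict[str, Set[str]] = {
    "free": {"basic_flags"},
    "enterprise": {"basic_flags", "audit_log"},
}


def has_feature(plan: str, feature: str) -> bool:
    """Return True if the customer's plan includes the feature."""
    return feature in PLAN_FEATURES.get(plan, set())


print(has_feature("enterprise", "audit_log"))  # prints True
print(has_feature("free", "audit_log"))        # prints False
```

Keeping the table as data separates the two responsibilities described above: developers implement the check, while the business team controls what the table contains.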
Experiment feature flags are our A/B/n flags. They let us test a hypothesis and make data-driven decisions about our feature work. If a hypothesis is correct and we want to turn the experiment into a permanent feature, the flag becomes a feature rollout and follows our feature rollout process.
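For an A/B/n flag, each user needs to land in the same variation every time so that results can be compared. A minimal sketch of that assignment, assuming deterministic hashing and equally weighted variations, might look like this; `assign_variation` and the experiment key are illustrative names only.

```python
import hashlib
from typing import Sequence


def assign_variation(experiment_key: str, user_id: str, variations: Sequence[str]) -> str:
    """Deterministically pick one variation for this user in this experiment."""
    if not variations:
        raise ValueError("at least one variation is required")
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer and map it to a variation.
    index = int.from_bytes(digest[:8], "big") % len(variations)
    return variations[index]


variation = assign_variation("checkout_button_color", "user-42", ["control", "green", "blue"])
# The same user always receives the same variation for this experiment.
assert variation == assign_variation("checkout_button_color", "user-42", ["control", "green", "blue"])
```

Including the experiment key in the hash keeps assignments independent across experiments, so being in the "control" group of one test says nothing about a user's group in another.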
As you can see, feature flags are not just a tool for progressive delivery and instant rollback; they can unlock a whole new, faster, safer way of deploying software. However, with great power comes great responsibility, and it’s important to clearly define your different flag types and governance.
Over the next six weeks, I’ll be diving deeper into how we manage each of these feature flag types at Optimizely, from use cases to access controls to communication, so you can leverage our best practices when building out your own feature flag types and governance model.
This is the first blog in a 7-part series that will dive more into the different feature flag types we have here at Optimizely, as well as the engineering teams that implement them. See you next time as we dive in deeper on the Feature Rollout feature flag type.
If you’re looking to get started with or scale feature flags, check out our free solution: Optimizely Rollouts!