How Product Teams Test, Learn and Adapt with Data
Recently at Test & Learn, Optimizely’s second annual Progressive Delivery and Experimentation virtual summit, Product and Engineering leaders from across the globe shared how they employ experimentation, analytics, and customer discovery to deliver better software with data. Below are the techniques they shared to empower every Product team to achieve higher engagement, growth, and retention. It’s all about asking bold questions, taking measured risks, and using data to make product decisions with confidence.
Ask beautiful questions to find growth opportunities.
According to Amplitude’s Tanner McGrath, Head of Product Analytics, and Bilal Mahmood, Head of Machine Learning, a company must stay curious and adapt to survive. It comes down to asking “Beautiful Questions” that shine a spotlight on opportunities with your customers and product. Most questions lead to more questions, creating an “engine of exploration and discovery.” Good questions lead to great questions, and eventually the best questions emerge. McGrath asserts that the answers to beautiful questions provide insights that can impact your business on the order of 10X. Finding those answers needs to be as frictionless as possible. Product experimentation is a powerful shortcut for answering your questions and determining which of them are truly “beautiful,” inspiring growth and innovation.
In his talk “How to Run High-impact Experiments with High-quality Customer Discovery,” Oji Udezue, VP of Product at Calendly, maps an innovation loop driven by “Why, What, and How” questions. Getting the answers to these questions wrong comes with varying degrees of risk, according to Udezue. He recommends leveraging customer discovery and experimentation to answer the “Why” and “What” questions. Getting insights from reliable data early on reduces the risk of taking your product and business down the wrong path.
Align metrics for success.
To set the stage for “The Product and Engineering Guide to Retention Experiments,” Aleksandra Vangelova, Product Owner at HelloFresh and Optimizely customer, defines her North Star metric: Cohort ROI, which shows how conversion and customer profitability are the focal points of their business. When it comes to measuring retention, cohorts (groups of new users bucketed by the date on which they started using the product) are an essential place to start, because introducing changes by cohort lets you control for biases. Aligning your North Star metric to cohorts focuses your business on customer retention. As a leading indicator of profitability, retention metrics help HelloFresh measure the long-term impact of experiments and product changes.
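As a rough illustration of bucketing users into cohorts, the sketch below groups users by the week they signed up and reports the share of each cohort that was active in each later week. The function names, the weekly granularity, and the sample data are assumptions made for this example; they are not taken from the HelloFresh talk.

```python
from collections import defaultdict
from datetime import date, timedelta

def week_start(day):
    """Return the Monday of the week containing `day` (used as the cohort key)."""
    return day - timedelta(days=day.weekday())

def cohort_retention(signups, activity, period_days=7):
    """signups: {user_id: signup date}; activity: iterable of (user_id, date).
    Returns {cohort_week: {period_index: share of the cohort active in that period}}."""
    members = defaultdict(set)
    for user, signed_up in signups.items():
        members[week_start(signed_up)].add(user)

    active = defaultdict(lambda: defaultdict(set))
    for user, day in activity:
        signed_up = signups.get(user)
        if signed_up is None or day < signed_up:
            continue  # ignore unknown users and events before signup
        period = (day - signed_up).days // period_days
        active[week_start(signed_up)][period].add(user)

    return {
        cohort: {p: len(users) / len(members[cohort])
                 for p, users in sorted(active[cohort].items())}
        for cohort in sorted(members)
    }

# Hypothetical data: two users sign up in the week of 1 March 2021, one the week after.
signups = {"a": date(2021, 3, 1), "b": date(2021, 3, 3), "c": date(2021, 3, 8)}
activity = [("a", date(2021, 3, 1)), ("b", date(2021, 3, 3)),
            ("a", date(2021, 3, 9)), ("c", date(2021, 3, 8))]
retention = cohort_retention(signups, activity)
```

Comparing the same period index across cohorts that did and did not receive a change is one simple way to read the effect of that change on retention.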
Luis Trindade, Principal Product Manager at the global fashion retailer Farfetch, provided a process for evaluating experiments against success metrics. Trindade promotes using an experiment to measure the impact of your product changes and focusing on the near-term effects for clarity. But don’t get caught in the trap of measuring the change itself. Instead, measure the consequences. For example, if you introduce a loyalty program to your customers as an experiment to improve LTV, sign-ups and purchases are interesting near-term metrics for measuring engagement. But what did the users do next? Do they return orders more frequently? What is their average order value? You also must look down-funnel and analyze user behavior over time and across audience segments to fully understand the experiment’s impact on revenue. Check out the recording of his full session to learn how to estimate the long-term impact that implementing a winning test has on your metrics.
Be rigorous in your data quality standards.
When you are looking to answer difficult questions or make big decisions, statistically significant data is the key to deciding with confidence. In their respective talks, both Calendly’s Udezue and Vangelova from HelloFresh recommend using a sample size calculator to estimate how much traffic and time you will need to reach significance in your tests. Lack of product traffic doesn’t have to be a blocker. Optimizely’s Stats Accelerator helps you weed out inconclusive experiment variations early on and reduces the time it takes to get actionable results. It uses machine learning to automatically allocate more traffic to experiment variations that show early promise of yielding impactful results. The outcome is a faster path to statistical significance.
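To give a feel for what a sample size calculator does, here is a minimal sketch using the classical fixed-horizon approximation for comparing two conversion rates. It is an illustration under stated assumptions, not the formula behind any particular vendor’s calculator, and tools built on sequential statistics may return different numbers. The function name, parameters, and example rates are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variation(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variation for a two-sided test
    comparing two conversion rates (normal approximation)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)        # rate you hope the variation reaches
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)

# Hypothetical scenario: 10% baseline conversion, hoping to detect a 10% relative lift.
visitors = sample_size_per_variation(0.10, 0.10)
print(visitors)
```

Dividing the result by your daily traffic per variation gives a rough estimate of how long the test needs to run. Smaller expected lifts drive the required sample size up quickly, since it grows with the inverse square of the difference between the two rates.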
Michael Lindon, Optimizely Staff Statistician, took the topic of rigor a level deeper in his talk about steering clear of sample ratio mismatch (SRM), a common A/B testing pitfall that is symptomatic of a range of potential data quality problems. For example, if your experiment is set to send 50% of traffic to one variation and 50% to another, and you see a skew of 45%/55% across a meaningful amount of traffic, this would suggest an SRM issue.
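Whether a 45%/55% split is alarming depends on how many visitors are behind it. The sketch below uses a standard chi-squared goodness-of-fit test for two variations to make that concrete. Note the assumptions: this is the classical fixed-sample check, not the sequential test described in Lindon’s talk, and the function name and visitor counts are made up for illustration.

```python
from math import erfc, sqrt

def srm_p_value(count_a, count_b, expected_share_a=0.5):
    """P-value of a chi-squared goodness-of-fit test (1 degree of freedom)
    comparing observed visitor counts with the intended traffic split."""
    total = count_a + count_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = ((count_a - expected_a) ** 2 / expected_a
            + (count_b - expected_b) ** 2 / expected_b)
    # For 1 degree of freedom, the chi-squared tail probability is erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))

# The same 45%/55% skew at two hypothetical traffic levels.
p_small = srm_p_value(45, 55)        # 100 visitors: consistent with chance
p_large = srm_p_value(4500, 5500)    # 10,000 visitors: very unlikely under a true 50/50 split
```

A very small p-value means the observed split is hard to explain by chance and the assignment or tracking pipeline deserves a look. Because this is a fixed-sample test, running it repeatedly as data arrives inflates the false positive rate, which is the problem a sequential test is designed to avoid.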
Lindon explains that SRM is a common issue that arises from incorrectly implemented experiments. Executing a product experiment requires a tremendous amount of infrastructure, and with that come countless opportunities for bugs. For example, if users are not being consistently bucketed into an experiment variation and are converting under the control conditions as well as the variant, the data collected for those users is no longer reliable for measuring the impact of the experiment. The primary reason for concern is that when SRM produces invalid data, bias can affect your metrics and go undetected.
The good news is that Optimizely open-sourced a sequential testing solution that anyone can implement to detect SRM, so you can trust your experiment data.
Watch Lindon’s talk and all of the Test & Learn sessions on demand.