Computing metrics with event-level experiment data

In this Lab, we'll walk through an end-to-end workflow for computing a series of metrics with data collected by both Optimizely and a third party during an Optimizely Full Stack experiment.

The experiment

We'll use simulated data from the following "experiment" in this notebook:

Attic & Button, a popular imaginary retailer of camera equipment and general electronics, has seen increased shipping times for some of its orders due to logistical difficulties imposed by the COVID-19 pandemic. As a result, customer support call volumes have increased. In order to inform potential customers and cut down on customer support costs, the company's leadership has decided to add an informative banner to the atticandbutton.com homepage.

In order to measure the impact this banner has on customer support volumes and decide which banner message is most effective, the team at Attic & Button have decided to run an experiment with the following variations:

  Variation     Key
  Control       "control"
  Message #1    "message_1"
  Message #2    "message_2"

The challenge

Attic & Button's call centers are managed by a third party. This third party shares call data with Attic & Button periodically in a CSV file, making it difficult to track customer support metrics on Optimizely's Experiment Results Page.

In this notebook, we'll use Optimizely Enriched Event Data and our third-party call data to compute a variety of metrics for our experiment, including "Support calls per visitor" and "Total call duration per visitor".

Global parameters

The following global parameters are used to control the execution in this notebook. These parameters may be overridden by setting environment variables prior to launching the notebook, for example:

export OPTIMIZELY_DATA_DIR=~/my_analysis_dir

In this block we check whether these parameters have been passed with environment variables and assign default values otherwise. The default value for OPTIMIZELY_API_TOKEN is a read-only token for a demonstration Optimizely project.

import os
from getpass import getpass
from IPython.display import clear_output

# This notebook requires an Optimizely API token.
# The default token provided here is a read-only token associated with a demo Optimizely account
OPTIMIZELY_API_TOKEN = os.environ.get("OPTIMIZELY_API_TOKEN", "2:d6K8bPrDoTr_x4hiFCNVidcZk0YEPwcIHZk-IZb5sM3Q7RxRDafI")

# Uncomment the following block to enable manual API token entry
# if OPTIMIZELY_API_TOKEN is None:
#    OPTIMIZELY_API_TOKEN = getpass("Enter your Optimizely personal API access token:")

# Default path for reading and writing analysis data
OPTIMIZELY_DATA_DIR = os.environ.get("OPTIMIZELY_DATA_DIR", "./covid_test_data")

# Set environment variables
# These variables are used by other notebooks and shell scripts invoked
# in this notebook
%env OPTIMIZELY_DATA_DIR={OPTIMIZELY_DATA_DIR}
%env OPTIMIZELY_API_TOKEN={OPTIMIZELY_API_TOKEN}

clear_output()

Download Optimizely Enriched Event data

This notebook relies (in part) on data downloaded from Optimizely's Enriched Event Export Service.

The default input data for this notebook can be found in the covid_test_data directory.

If you have the oevents command line tool installed and accessible on your PATH, you may uncomment the following commands to re-download this data. Note that this requires OPTIMIZELY_API_TOKEN to be set to the default value specified above.

We'll start by downloading decision data collected during our experiment. Each decision captures the moment a visitor was added to our experiment.

# Uncomment this line to re-download decision data for this experiment
# Note: requires oevents to be installed and accessible on your PATH

# !oevents load --type decisions --experiment 18786493712 --date 2020-09-14

Next we'll download conversion data collected during our experiment. Each conversion event captures the moment a visitor took some action on our website, e.g. viewing our homepage, adding an item to their shopping cart, or making a purchase.

# Uncomment this line to re-download conversion data for this experiment
# Note: requires oevents to be installed and accessible on your PATH

# !oevents load --type events --date 2020-09-14

Load Decision and Conversion Data into Spark Dataframes

We'll use PySpark to transform data in this notebook. We'll start by creating a new local Spark session.

from pyspark.sql import SparkSession

num_cores = 1
driver_ip = "127.0.0.1"
driver_memory_gb = 1
executor_memory_gb = 2

# Create a local Spark session
spark = SparkSession \
    .builder \
    .appName("Python Spark SQL") \
    .config(f"local[{num_cores}]") \
    .config("spark.sql.repl.eagerEval.enabled", True) \
    .config("spark.sql.repl.eagerEval.truncate", 120) \
    .config("spark.driver.bindAddress", driver_ip) \
    .config("spark.driver.host", driver_ip) \
    .config("spark.driver.memory", f"{driver_memory_gb}g") \
    .config("spark.executor.memory", f"{executor_memory_gb}g") \
    .getOrCreate()

Next we'll load our decision data into a Spark dataframe:

import os
from lib import util

decisions_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=decisions")

# load enriched decision data from disk into a new Spark dataframe
decisions = util.read_parquet_data_from_disk(
    spark_session=spark,
    data_path=decisions_dir,
    view_name="decisions"
)

Now we can write SQL-style queries against our decisions view. Let's use a simple query to examine our data:

spark.sql("""
    SELECT
        *
    FROM
        decisions
    LIMIT 3
""")
uuid | timestamp | process_timestamp | visitor_id | session_id | account_id | campaign_id | experiment_id | variation_id | attributes | user_ip | user_agent | referer | is_holdback | revision | client_engine | client_version | date | experiment
0244b48f-cd2c-45fe-86b5-accb0864aa9f | 2020-09-14 11:38:10.022 | 2020-09-14 11:39:09.401 | user_9763 | -1361007105 | 596780373 | 18811053836 | 18786493712 | 18802093142 | [[$opt_bot_filtering, $opt_bot_filtering, custom, false], [$opt_enrich_decisions, $opt_enrich_decisions, custom, true... | 162.227.140.251 | python-requests/2.24.0 | null | false | 99 | python-sdk | 3.5.2 | 2020-09-14 | 18786493712
07c73a10-0575-4990-b8d3-c5750f5b6fa1 | 2020-09-14 11:35:00.323 | 2020-09-14 11:35:07.792 | user_7889 | -12601611 | 596780373 | 18811053836 | 18786493712 | 18818611832 | [[$opt_bot_filtering, $opt_bot_filtering, custom, false], [$opt_enrich_decisions, $opt_enrich_decisions, custom, true... | 162.227.140.251 | python-requests/2.24.0 | null | false | 99 | python-sdk | 3.5.2 | 2020-09-14 | 18786493712
07c9d321-c336-49f4-86d0-ff6fed1d5b49 | 2020-09-14 11:29:21.83 | 2020-09-14 11:29:28.71 | user_4546 | 1353967797 | 596780373 | 18811053836 | 18786493712 | 18802093142 | [[$opt_bot_filtering, $opt_bot_filtering, custom, false], [$opt_enrich_decisions, $opt_enrich_decisions, custom, true... | 162.227.140.251 | python-requests/2.24.0 | null | false | 99 | python-sdk | 3.5.2 | 2020-09-14 | 18786493712

Next we'll load conversion data:

# oevents downloads conversion data into the type=events subdirectory
conversions_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=events")

# load conversion data from disk into a new Spark dataframe
conversions = util.read_parquet_data_from_disk(
    spark_session=spark,
    data_path=conversions_dir,
    view_name="events"
)

Let's take a look at our data:

spark.sql("""
    SELECT
        *
    FROM
        events
    LIMIT 3
""")
uuid | timestamp | process_timestamp | visitor_id | session_id | account_id | experiments | entity_id | attributes | user_ip | user_agent | referer | event_type | event_name | revenue | value | quantity | tags | revision | client_engine | client_version | date | event
01d16e55-c276-4147-b2ba-586ec55d18ee | 2020-09-14 10:45:01.756 | 2020-09-14 10:45:20.535 | user_7468 | -294926545 | 596780373 | [[18803622799, 18805683213, 18774763028, false]] | 18803874034 | [[$opt_bot_filtering, $opt_bot_filtering, custom, false], [$opt_enrich_decisions, $opt_enrich_decisions, custom, true... | 162.227.140.251 | python-requests/2.24.0 | null | null | detail_page_view | 0 | null | 0 | [product -> android, sku -> 456, category -> electronics] | 91 | python-sdk | 3.5.2 | 2020-09-14 | detail_page_view
0579989e-cf42-4f09-9994-35720ab4084e | 2020-09-14 11:37:19.081 | 2020-09-14 11:38:18.692 | user_9260 | -1708947236 | 596780373 | [[18803622799, 18805683213, 18821642160, false], [18811053836, 18786493712, 18802093142, false]] | 15776040040 | [[$opt_bot_filtering, $opt_bot_filtering, custom, false], [$opt_enrich_decisions, $opt_enrich_decisions, custom, true... | 162.227.140.251 | python-requests/2.24.0 | null | null | add_to_cart | 0 | null | 0 | [product -> android, price -> 799.99, sku -> 456, category -> electronics] | 99 | python-sdk | 3.5.2 | 2020-09-14 | add_to_cart
05c15bf4-f3a1-47e1-85ee-46c64b92caf8 | 2020-09-14 10:37:45.575 | 2020-09-14 10:38:12.498 | user_3168 | 1630703036 | 596780373 | [[18803622799, 18805683213, 18774763028, false]] | 18822540003 | [[$opt_bot_filtering, $opt_bot_filtering, custom, false], [$opt_enrich_decisions, $opt_enrich_decisions, custom, true... | 162.227.140.251 | python-requests/2.24.0 | null | null | homepage_view | 0 | null | 0 | [] | 91 | python-sdk | 3.5.2 | 2020-09-14 | homepage_view

Compute some useful intermediate experiment datasets

In this section, we'll compute three useful intermediate experiment datasets:

  1. Enriched decisions - Optimizely decision data enriched with human-readable experiment and variation names.
  2. Experiment Units - the individual units (usually website visitors or app users) that are exposed to a control or treatment in the course of a digital experiment.
  3. Experiment Events - conversion events, such as a button click or a purchase, that were influenced by a digital experiment.

The following diagram illustrates how these datasets are used to compute metric observations for our experiment:

[Diagram: Transformations]

Enriched decisions

First we'll use Optimizely's Experiment API to enrich our decision data with experiment and variation names. This step makes it easier to build human-readable experiment reports with this data.

The code for enriching decision data can be found in the enriching_decision_data.ipynb notebook in this lab directory.

%run ./enriching_decision_data.ipynb
Successfully authenticated to Optimizely.
Found these experiment IDs in the loaded decision data:
    18786493712

Experiment Units

Experiment units are the individual units that are exposed to a control or treatment in the course of an online experiment. In most online experiments, subjects are website visitors or app users. However, depending on your experiment design, treatments may also be applied to individual user sessions, service requests, search queries, etc.

[Diagram: Experiment units]

experiment_units = spark.sql(f"""
    SELECT
        *
    FROM (
        SELECT
            *,
            RANK() OVER (PARTITION BY experiment_id, visitor_id ORDER BY timestamp ASC) AS rnk
        FROM
            enriched_decisions
    )
    WHERE
        rnk = 1
    ORDER BY timestamp ASC
""").drop("rnk")
experiment_units.createOrReplaceTempView("experiment_units")

Let's examine our experiment unit dataset:

spark.sql("""
    SELECT
        visitor_id,
        experiment_name,
        variation_name,
        timestamp
    FROM
        experiment_units
    LIMIT 3
""")
visitor_id | experiment_name | variation_name | timestamp
user_0 | covid_messaging_experiment | control | 2020-09-14 11:21:40.177
user_1 | covid_messaging_experiment | control | 2020-09-14 11:21:40.279
user_2 | covid_messaging_experiment | control | 2020-09-14 11:21:40.381

Let's count the number of visitors in each experiment variation:

spark.sql("""
    SELECT 
        experiment_name,
        variation_name,
        count(*) as unit_count
    FROM 
        experiment_units
    GROUP BY 
        experiment_name,
        variation_name
    ORDER BY
        experiment_name ASC,
        variation_name ASC
""")
experiment_name | variation_name | unit_count
covid_messaging_experiment | control | 3304
covid_messaging_experiment | message_1 | 3367
covid_messaging_experiment | message_2 | 3329

Experiment Events

An experiment event is an event, such as a button click or a purchase, that was influenced by an experiment. We compute this view by isolating the conversion events triggered during a finite window of time (called the attribution window) after a visitor has been exposed to an experiment treatment.

# Create the experiment_events view
experiment_events = spark.sql(f"""
    SELECT
        u.experiment_id,
        u.experiment_name,
        u.variation_id,
        u.variation_name,
        e.*
    FROM
        experiment_units u INNER JOIN events e ON u.visitor_id = e.visitor_id
    WHERE
        e.timestamp BETWEEN u.timestamp AND (u.timestamp + INTERVAL 48 HOURS)
""")
experiment_events.createOrReplaceTempView("experiment_events")

Let's examine our Experiment Events dataset:

spark.sql("""
    SELECT
        timestamp,
        visitor_id,
        experiment_name,
        variation_name,
        event_name,
        tags,
        revenue
    FROM
        experiment_events
    LIMIT 10
""")
timestamp | visitor_id | experiment_name | variation_name | event_name | tags | revenue
2020-09-14 11:23:50.677 | user_1283 | covid_messaging_experiment | control | homepage_view | [] | 0
2020-09-14 11:23:50.883 | user_1285 | covid_messaging_experiment | message_1 | homepage_view | [] | 0
2020-09-14 11:24:03.368 | user_1408 | covid_messaging_experiment | message_2 | homepage_view | [] | 0
2020-09-14 11:24:21.053 | user_1582 | covid_messaging_experiment | message_1 | homepage_view | [] | 0
2020-09-14 11:24:21.053 | user_1582 | covid_messaging_experiment | message_1 | detail_page_view | [product -> iphone, sku -> 123, category -> electronics] | 0
2020-09-14 11:24:21.053 | user_1582 | covid_messaging_experiment | message_1 | detail_page_view | [product -> android, sku -> 456, category -> electronics] | 0
2020-09-14 11:24:21.053 | user_1582 | covid_messaging_experiment | message_1 | detail_page_view | [product -> phone case, sku -> 789, category -> accessories] | 0
2020-09-14 11:21:41.911 | user_17 | covid_messaging_experiment | message_1 | homepage_view | [] | 0
2020-09-14 11:24:37.315 | user_1742 | covid_messaging_experiment | message_2 | homepage_view | [] | 0
2020-09-14 11:24:56.09 | user_1927 | covid_messaging_experiment | message_1 | detail_page_view | [product -> android, sku -> 456, category -> electronics] | 0

As above, let's count the number of events that were influenced by each variation:

spark.sql(f"""
    SELECT
        experiment_name,
        variation_name,
        event_name,
        count(*) as event_count
    FROM
        experiment_events
    GROUP BY
        experiment_name,
        variation_name,
        event_name
    ORDER BY
        experiment_name ASC,
        variation_name ASC,
        event_name ASC
""")
experiment_name | variation_name | event_name | event_count
covid_messaging_experiment | control | add_to_cart | 326
covid_messaging_experiment | control | detail_page_view | 1799
covid_messaging_experiment | control | homepage_view | 3304
covid_messaging_experiment | control | purchase | 326
covid_messaging_experiment | message_1 | add_to_cart | 338
covid_messaging_experiment | message_1 | detail_page_view | 2414
covid_messaging_experiment | message_1 | homepage_view | 3367
covid_messaging_experiment | message_1 | purchase | 338
covid_messaging_experiment | message_2 | add_to_cart | 446
covid_messaging_experiment | message_2 | detail_page_view | 2900
covid_messaging_experiment | message_2 | homepage_view | 3329
covid_messaging_experiment | message_2 | purchase | 446

Compute metric observations

Metric observations map each experiment unit to a specific numerical outcome observed during an experiment. For example, in order to measure purchase conversion rate associated with each variation in an experiment, we can map each visitor to a 0 or 1, depending on whether or not they'd made at least one purchase during the attribution window in our experiment.

Unlike experiment units and experiment events, which can be computed using simple transformations, metric observations are metric-dependent and can be arbitrarily complex, depending on the outcome you're trying to measure.


We're going to use a helper function, compute_metric_observations, to abstract away some of the redundant parts of this computation. This function takes a set of "raw observations" as input, and

  1. performs a LEFT JOIN with our experiment units in order to ensure that the resulting dataset contains an observation for every visitor in our experiment
  2. (optionally) appends the resulting metric observations to a global observations dataset

This allows us to focus on the logic for aggregating experiment events into a numerical observation, which is the most interesting part of the process.
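
The actual implementation lives in the lib.util module bundled with this Lab. As a rough, illustrative sketch (not the library's exact code), the core of the computation might look something like this:

from pyspark.sql import DataFrame
from pyspark.sql import functions as F

# Illustrative only -- approximates what util.compute_metric_observations does
def compute_metric_observations_sketch(metric_name, raw_observations_df,
                                        experiment_units_df, append_to=None):
    observations_df = (
        experiment_units_df
        # LEFT JOIN: keep every experiment unit, even those with no qualifying events
        .join(raw_observations_df, on="visitor_id", how="left")
        # Units with no qualifying events get an observation of 0
        .fillna({"observation": 0})
        # Tag each row with the metric it belongs to
        .withColumn("metric_name", F.lit(metric_name))
    )
    # Optionally append to a running "global" observations dataset
    return observations_df if append_to is None else append_to.unionByName(observations_df)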

Metric: Purchase conversion rate

In this query we measure for each visitor whether they made at least one purchase. The resulting observation should be 1 if the visitor triggered the event in question during the attribution window and 0 otherwise.

Since any visitor who triggered an appropriate experiment event should be counted, we can simply select a 1.

## Unique conversions on the "purchase" event.
raw_purchase_conversion_rate_obs = spark.sql(f"""
    SELECT
        visitor_id,
        1 as observation
    FROM
        experiment_events
    WHERE
        event_name = 'purchase'
    GROUP BY
        visitor_id
""")
raw_purchase_conversion_rate_obs.toPandas().head(5)
visitor_id observation
0 user_5967 1
1 user_1434 1
2 user_3058 1
3 user_926 1
4 user_9069 1

We'll use the util.compute_metric_observations function to perform a left outer join between experiment_units and our newly computed purchase conversion data.

observations = util.compute_metric_observations(
    "Purchase conversion rate",
    raw_purchase_conversion_rate_obs,
    experiment_units,
)

Let's take a look at our observations view:

observations.createOrReplaceTempView("observations")
spark.sql("""
    SELECT 
        metric_name,
        timestamp,
        visitor_id, 
        experiment_name, 
        variation_name, 
        observation 
    FROM 
        observations
    ORDER BY
        timestamp ASC
    LIMIT 10
""")
metric_name | timestamp | visitor_id | experiment_name | variation_name | observation
Purchase conversion rate | 2020-09-14 11:21:40.177 | user_0 | covid_messaging_experiment | control | 0
Purchase conversion rate | 2020-09-14 11:21:40.279 | user_1 | covid_messaging_experiment | control | 0
Purchase conversion rate | 2020-09-14 11:21:40.381 | user_2 | covid_messaging_experiment | control | 0
Purchase conversion rate | 2020-09-14 11:21:40.482 | user_3 | covid_messaging_experiment | message_1 | 0
Purchase conversion rate | 2020-09-14 11:21:40.586 | user_4 | covid_messaging_experiment | message_1 | 0
Purchase conversion rate | 2020-09-14 11:21:40.69 | user_5 | covid_messaging_experiment | message_2 | 0
Purchase conversion rate | 2020-09-14 11:21:40.792 | user_6 | covid_messaging_experiment | message_2 | 0
Purchase conversion rate | 2020-09-14 11:21:40.894 | user_7 | covid_messaging_experiment | message_2 | 0
Purchase conversion rate | 2020-09-14 11:21:40.996 | user_8 | covid_messaging_experiment | message_1 | 0
Purchase conversion rate | 2020-09-14 11:21:41.097 | user_9 | covid_messaging_experiment | control | 0

Metric observations can be used to compute a variety of useful statistics. Let's compute the value of our purchase conversion rate metric for all of the visitors in our experiment:

spark.sql("""
    SELECT
        metric_name,
        experiment_name,
        count(1) as unit_count,
        sum(observation),
        sum(observation) / (1.0 * count(1)) as metric_value
    FROM
        observations
    WHERE
        metric_name = "Purchase conversion rate"
    GROUP BY
        metric_name,
        experiment_name
""")
metric_name | experiment_name | unit_count | sum(observation) | metric_value
Purchase conversion rate | covid_messaging_experiment | 10000 | 555 | 0.05550000000000000

Now let's compute the purchase conversion rate broken down by experiment variation:

spark.sql("""
    SELECT
        metric_name,
        experiment_name,
        variation_name,
        count(1) as unit_count,
        sum(observation),
        sum(observation) / (1.0 * count(1)) as metric_value
    FROM
        observations
    WHERE
        metric_name = "Purchase conversion rate"
    GROUP BY
        metric_name,
        experiment_name,
        variation_name
""")
metric_name | experiment_name | variation_name | unit_count | sum(observation) | metric_value
Purchase conversion rate | covid_messaging_experiment | message_1 | 3367 | 169 | 0.05019305019305019
Purchase conversion rate | covid_messaging_experiment | control | 3304 | 163 | 0.04933414043583535
Purchase conversion rate | covid_messaging_experiment | message_2 | 3329 | 223 | 0.06698708320817062

Metric: Product detail page views per visitor

In this query we count the number of product detail page views per visitor.

## Count "detail_page_view" events per visitor.
observations = util.compute_metric_observations(
    "Product detail page views per visitor",
    raw_observations_df = spark.sql("""
        SELECT
            visitor_id,
            count(1) as observation
        FROM
            experiment_events
        WHERE
            event_name = "detail_page_view"
        GROUP BY
            visitor_id
    """),
    experiment_units_df = experiment_units,
    append_to=observations
)

Let's inspect a sample of the observations we've computed for this metric:

spark.sql("""
    SELECT 
        metric_name, 
        timestamp,
        experiment_name, 
        variation_id, 
        visitor_id, 
        observation 
    FROM 
        observations
    WHERE
        metric_name = "Product detail page views per visitor"
    LIMIT 5
""")
metric_name | timestamp | experiment_name | variation_id | visitor_id | observation
Product detail page views per visitor | 2020-09-14 11:23:26.823 | covid_messaging_experiment | 18817551468 | user_1048 | 0
Product detail page views per visitor | 2020-09-14 11:23:59.303 | covid_messaging_experiment | 18817551468 | user_1368 | 3
Product detail page views per visitor | 2020-09-14 11:24:05.094 | covid_messaging_experiment | 18818611832 | user_1425 | 3
Product detail page views per visitor | 2020-09-14 11:24:41.774 | covid_messaging_experiment | 18817551468 | user_1786 | 0
Product detail page views per visitor | 2020-09-14 11:25:30.073 | covid_messaging_experiment | 18802093142 | user_2262 | 0

Metric: Revenue from electronics purchases

In this query we compute the total revenue associated with electronics purchases made by our experiment subjects.

observations = util.compute_metric_observations(
    "Electronics revenue per visitor",
    raw_observations_df = spark.sql("""
        SELECT
            visitor_id,
            sum(revenue) as observation
        FROM 
            experiment_events
            LATERAL VIEW explode(tags) t
        WHERE
            t.key = "category" AND 
            t.value = "electronics" AND
            event_name = "purchase"
        GROUP BY
            visitor_id
    """),
    experiment_units_df = experiment_units,
    append_to=observations
)

Again, let's examine our observations:

spark.sql("""
    SELECT 
        metric_name, 
        timestamp,
        experiment_name, 
        variation_id, 
        visitor_id, 
        observation 
    FROM 
        observations
    WHERE
        metric_name = "Electronics revenue per visitor" AND
        observation > 0
    LIMIT 5
""")
metric_name | timestamp | experiment_name | variation_id | visitor_id | observation
Electronics revenue per visitor | 2020-09-14 11:24:06.005 | covid_messaging_experiment | 18817551468 | user_1434 | 79999
Electronics revenue per visitor | 2020-09-14 11:31:45.74 | covid_messaging_experiment | 18818611832 | user_5967 | 79999
Electronics revenue per visitor | 2020-09-14 11:26:50.953 | covid_messaging_experiment | 18818611832 | user_3058 | 79999
Electronics revenue per visitor | 2020-09-14 11:23:14.427 | covid_messaging_experiment | 18817551468 | user_926 | 79999
Electronics revenue per visitor | 2020-09-14 11:27:48.77 | covid_messaging_experiment | 18818611832 | user_3628 | 99999

Metric: Call center volume

We can use the same techniques to compute experiment metrics using "external" data that was not collected by Optimizely. We'll demonstrate by loading a CSV file of customer support call records.

We'll start by reading in our call center data:

import pandas as pd

# Read call center logs CSV into a pandas dataframe
df = pd.read_csv("covid_test_data/call_data.csv")

# Display a sample of our call record data
df.head(5)
visitor_id call_start call_duration_min
0 user_1 9/15/2020 23:00:00 1.224888
1 user_9 9/15/2020 11:00:00 3.999876
2 user_14 9/15/2020 2:00:00 2.843966
3 user_17 9/15/2020 9:00:00 6.462949
4 user_21 9/15/2020 16:00:00 6.397294

Now let's make sure our call center data schema is compatible with the transformations we want to perform.

# Convert "call start" timestamp strings to datetime objects
df["timestamp"] = pd.to_datetime(df.call_start)

# Rename the "user_id" column (if present) to "visitor_id" to match our decision schema
df = df.rename(columns={"user_id" : "visitor_id"})

# Convert pandas to spark dataframe
call_records = spark.createDataFrame(df)

# Create a temporary view so that we can query using SQL
call_records.createOrReplaceTempView("call_records")

# Display a sample of our call record data
spark.sql("SELECT * FROM call_records LIMIT 5")
visitor_id | call_start | call_duration_min | timestamp
user_1 | 9/15/2020 23:00:00 | 1.22488779 | 2020-09-15 23:00:00
user_9 | 9/15/2020 11:00:00 | 3.999875568 | 2020-09-15 11:00:00
user_14 | 9/15/2020 2:00:00 | 2.8439659039999996 | 2020-09-15 02:00:00
user_17 | 9/15/2020 9:00:00 | 6.46294865 | 2020-09-15 09:00:00
user_21 | 9/15/2020 16:00:00 | 6.39729429 | 2020-09-15 16:00:00

Now let's transform our call center logs into "experiment calls" using the attribution logic we used above to compute "experiment events":

# Create the experiment_calls view
experiment_calls = spark.sql(f"""
    SELECT
        u.experiment_id,
        u.experiment_name,
        u.variation_id,
        u.variation_name,
        e.*
    FROM
        experiment_units u INNER JOIN call_records e ON u.visitor_id = e.visitor_id
    WHERE
        e.timestamp BETWEEN u.timestamp AND (u.timestamp + INTERVAL 48 HOURS)
""")
experiment_calls.createOrReplaceTempView("experiment_calls")

Now we can compute metric observations for call center calls and duration!

# Count the number of support phone calls per visitor
observations = util.compute_metric_observations(
    "Customer support calls per visitor",
    raw_observations_df = spark.sql("""
        SELECT
            visitor_id,
            count(1) as observation
        FROM 
            experiment_calls
        GROUP BY
            visitor_id
    """),
    experiment_units_df = experiment_units,
    append_to=observations
)

# Sum the total support call minutes per visitor
observations = util.compute_metric_observations(
    "Total customer support minutes per visitor",
    raw_observations_df = spark.sql("""
        SELECT
            visitor_id,
            sum(call_duration_min) as observation
        FROM 
            experiment_calls
        GROUP BY
            visitor_id
    """),
    experiment_units_df = experiment_units,
    append_to=observations
)

Computing metric values for experiment cohorts

We can slice and dice our metric observation data to compute metric values for different experiment cohorts. Here are some examples:

Computing metric values per variation

Let's start by computing metric values broken down by experiment variation.

# Compute metric values broken down by experiment variation
spark.sql("""
    SELECT
        metric_name,
        experiment_name,
        variation_name,
        count(1) as unit_count,
        sum(observation),
        sum(observation) / (1.0 * count(1)) as metric_value
    FROM
        observations
    GROUP BY
        metric_name,
        experiment_name,
        variation_name
    ORDER BY
        metric_name,
        experiment_name,
        variation_name
""")
metric_name | experiment_name | variation_name | unit_count | sum(observation) | metric_value
Customer support calls per visitor | covid_messaging_experiment | control | 3304 | 957.0 | 0.2896489104116223
Customer support calls per visitor | covid_messaging_experiment | message_1 | 3367 | 649.0 | 0.19275319275319275
Customer support calls per visitor | covid_messaging_experiment | message_2 | 3329 | 672.0 | 0.20186242114749173
Electronics revenue per visitor | covid_messaging_experiment | control | 3304 | 1.4499837E7 | 4388.570520581114
Electronics revenue per visitor | covid_messaging_experiment | message_1 | 3367 | 1.4959831E7 | 4443.07425007425
Electronics revenue per visitor | covid_messaging_experiment | message_2 | 3329 | 1.9899777E7 | 5977.704115349955
Product detail page views per visitor | covid_messaging_experiment | control | 3304 | 1799.0 | 0.5444915254237288
Product detail page views per visitor | covid_messaging_experiment | message_1 | 3367 | 2414.0 | 0.7169587169587169
Product detail page views per visitor | covid_messaging_experiment | message_2 | 3329 | 2900.0 | 0.871132472213878
Purchase conversion rate | covid_messaging_experiment | control | 3304 | 163.0 | 0.04933414043583535
Purchase conversion rate | covid_messaging_experiment | message_1 | 3367 | 169.0 | 0.05019305019305019
Purchase conversion rate | covid_messaging_experiment | message_2 | 3329 | 223.0 | 0.06698708320817062
Total customer support minutes per visitor | covid_messaging_experiment | control | 3304 | 4617.232851461323 | 1.3974675700548798
Total customer support minutes per visitor | covid_messaging_experiment | message_1 | 3367 | 2612.508941759429 | 0.775915931618482
Total customer support minutes per visitor | covid_messaging_experiment | message_2 | 3329 | 2663.5191790325903 | 0.8000958783516342

It looks like the average number of customer support calls per visitor and the average call duration are both much lower in the cohorts that saw one of our informational banners!

Computing metric values for a visitor segment

We can filter metric observations by visitor attributes in order to compute metric values for a particular segment.

# Compute metric values broken down by customer segment
spark.sql("""
    SELECT
        metric_name,
        experiment_name,
        variation_name,
        attrs.value as browser,
        count(1) as unit_count,
        sum(observation),
        sum(observation) / (1.0 * count(1)) as metric_value
    FROM
        observations
        LATERAL VIEW explode(attributes) AS attrs
    WHERE
        attrs.name = "browser"
    GROUP BY
        metric_name,
        experiment_name,
        variation_name,
        attrs.value
    ORDER BY
        metric_name,
        experiment_name,
        variation_name,
        attrs.value
""")
metric_name | experiment_name | variation_name | browser | unit_count | sum(observation) | metric_value
Customer support calls per visitor | covid_messaging_experiment | control | chrome | 1651 | 472.0 | 0.28588734100545127
Customer support calls per visitor | covid_messaging_experiment | control | firefox | 1094 | 321.0 | 0.29341864716636196
Customer support calls per visitor | covid_messaging_experiment | control | safari | 559 | 164.0 | 0.29338103756708406
Customer support calls per visitor | covid_messaging_experiment | message_1 | chrome | 1723 | 337.0 | 0.1955890887986071
Customer support calls per visitor | covid_messaging_experiment | message_1 | firefox | 1085 | 207.0 | 0.19078341013824884
Customer support calls per visitor | covid_messaging_experiment | message_1 | safari | 559 | 105.0 | 0.18783542039355994
Customer support calls per visitor | covid_messaging_experiment | message_2 | chrome | 1695 | 359.0 | 0.21179941002949854
Customer support calls per visitor | covid_messaging_experiment | message_2 | firefox | 1121 | 223.0 | 0.19892952720785012
Customer support calls per visitor | covid_messaging_experiment | message_2 | safari | 513 | 90.0 | 0.17543859649122806
Electronics revenue per visitor | covid_messaging_experiment | control | chrome | 1651 | 7719913.0 | 4675.90127195639
Electronics revenue per visitor | covid_messaging_experiment | control | firefox | 1094 | 4259953.0 | 3893.924131627057
Electronics revenue per visitor | covid_messaging_experiment | control | safari | 559 | 2519971.0 | 4507.998211091234
Electronics revenue per visitor | covid_messaging_experiment | message_1 | chrome | 1723 | 6739924.0 | 3911.737666860128
Electronics revenue per visitor | covid_messaging_experiment | message_1 | firefox | 1085 | 5619936.0 | 5179.664516129033
Electronics revenue per visitor | covid_messaging_experiment | message_1 | safari | 559 | 2599971.0 | 4651.110912343471
Electronics revenue per visitor | covid_messaging_experiment | message_2 | chrome | 1695 | 1.0199885E7 | 6017.6312684365785
Electronics revenue per visitor | covid_messaging_experiment | message_2 | firefox | 1121 | 7119921.0 | 6351.401427297056
Electronics revenue per visitor | covid_messaging_experiment | message_2 | safari | 513 | 2579971.0 | 5029.1832358674465
Product detail page views per visitor | covid_messaging_experiment | control | chrome | 1651 | 912.0 | 0.5523924894003635
Product detail page views per visitor | covid_messaging_experiment | control | firefox | 1094 | 571.0 | 0.5219378427787934

only showing top 20 rows

Computing sequential statistics with Optimizely's Stats Services

According to the metric data above, visitors who saw either of our informational banners during our experiment were less likely to call support. How confident can we be that the difference in call rates can be attributed to our banners, as opposed to statistical noise?

We're working on launching a set of Stats Services that can be used to perform sequential hypothesis testing on metric observation data. You can learn more about these services and request early access here.
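
In the meantime, a quick fixed-horizon sanity check can be run directly on the observations dataset. The sketch below compares "Customer support calls per visitor" between the control and each treatment with Welch's t-test; it assumes SciPy is available in your environment (it isn't otherwise used in this Lab), and it is not the sequential testing that Optimizely's Stats Services perform.

from scipy import stats

# Pull per-visitor observations for one metric into a pandas dataframe
calls = spark.sql("""
    SELECT variation_name, observation
    FROM observations
    WHERE metric_name = "Customer support calls per visitor"
""").toPandas()

control_obs = calls[calls.variation_name == "control"].observation

# Welch's t-test: compare each treatment against the control (fixed-horizon, not sequential)
for variation in ["message_1", "message_2"]:
    treatment_obs = calls[calls.variation_name == variation].observation
    t_stat, p_value = stats.ttest_ind(treatment_obs, control_obs, equal_var=False)
    print(f"{variation} vs. control: t = {t_stat:.2f}, p = {p_value:.4f}")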

Writing our datasets to disk

We'll write our experiment units, experiment events, and metric observations datasets to disk so that they may be used for other analysis tasks.

from lib import util

experiment_units_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=experiment_units")
util.write_parquet_data_to_disk(experiment_units, experiment_units_dir, partition_by="experiment_id")

experiment_events_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=experiment_events")
util.write_parquet_data_to_disk(experiment_events, experiment_events_dir, partition_by=["experiment_id", "event_name"])

metric_observations_dir = os.path.join(OPTIMIZELY_DATA_DIR, "type=metric_observations")
util.write_parquet_data_to_disk(observations, metric_observations_dir, partition_by=["experiment_id", "metric_name"])
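
In a later session, these partitioned Parquet datasets can be read back with Spark's built-in Parquet reader and re-registered as views (a minimal sketch, assuming the writes above succeeded):

# Read the partitioned Parquet datasets back and re-register the views used in this notebook
experiment_units = spark.read.parquet(experiment_units_dir)
experiment_units.createOrReplaceTempView("experiment_units")

experiment_events = spark.read.parquet(experiment_events_dir)
experiment_events.createOrReplaceTempView("experiment_events")

observations = spark.read.parquet(metric_observations_dir)
observations.createOrReplaceTempView("observations")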

How to run this notebook

This notebook lives in the Optimizely Labs repository, where you can download it along with everything you need to run it.

Once you've downloaded this Lab directory (on its own, or as part of the Optimizely Labs repository), follow the instructions in the README.md file for this Lab.

