Report

How to choose a personalization platform

And turn personalization from an ambition into revenue.

You're not buying a personalization platform. You're buying the ability to prove personalization worked.

Six months in, the CMO asks what it returned. Click-through rates are up, and a few segments saw real lift. Then finance asks whether any of that converted to revenue, retention, or lifetime value, and the room goes quiet.

All platforms can serve a flavor of personalization. Few can help you prove it actually worked. What separates them is whether the program you build survives a budget review.

The old taxonomy was rules-based vs. ML-driven, useful when the difference between platforms was actually one or the other. In 2026, every vendor claims ML and AI, so the taxonomy is dead.

Old way: A rule library that grows until nobody can maintain it. Campaigns that look good in demos but can't tell you whether they drove revenue. A measurement layer built on clicks because that's what the platform could measure. Data living in four different tools, reconciled nightly, trusted by nobody.

New way: Personalization that decides per visitor without a rule for every case. Audiences surfaced by AI, not defined by hand. Content variants built without a developer queue. A holdback running automatically against every experience, so when finance asks what it returned, you have an answer. Warehouse-native analytics that read directly from the place where your business metrics already live.

Optimizely was built for the new way, and was named a Leader in the 2026 Gartner® Magic Quadrant™ for Personalization Engines for the second year running.

This guide walks through what to look for, the questions that separate real platforms from polished demos, and the disqualifiers most buyers don't catch until it's already costing them.

What to look for at a glance

A checklist to take into your evaluation. Each item is covered in detail in this guide and will help you choose the right personalization platform for you, even if that’s not us.

  1. Can the platform tie any personalization to any business metric in your warehouse, in real time, without exports?

  2. Does it run holdbacks automatically against every personalization?

  3. Does the stats engine hold up to scrutiny — sequential testing, CUPED, global holdouts?

  4. Can it personalize per visitor without a rule for every segment?

  5. Does it support rules, AI audiences, contextual bandits, and A/B testing on one platform?

  6. Can a marketer ship a personalized experience, start to finish, without engineering?

  7. Does AI generate content variants in your brand voice, with code that ships clean?

  8. Is personalization delivered at the edge with no flicker?

  9. Is compliance documented and auditable?

  10. Is the total cost of ownership disclosed before you sign?


The personalization pyramid (and how to think about where you are)

Picture your customer base as a pyramid.

Broad audience segments at the base. Sharper segments above that, built on behavior, lifecycle, or persona. At the top, true 1:1 personalization where every visitor sees something tuned specifically to them.

Most programs start at the base and try to climb by writing more rules. That works until the rules become impossible to maintain, and the team ends up spending more time on segment logic than on actual personalization.

Where is your program today, and where does it need to be in twelve months?

Which tier is right for you

  • Broad segments: Same experience for all visitors in a defined group. A few rules, basic behavioral data, and a CMS that can swap content.
  • Behavioral and lifecycle segments: Experiences that shift based on what a visitor has done or where they are in the journey. Richer data, more content variants, a team with the capacity to manage the logic.
  • 1:1 per-visitor: Every visitor sees something tuned to them in real time. AI decisioning, a content layer that scales, and measurement that can prove it worked.

Most teams don't start with all three. They start with broad-segment campaigns, learn what works, and build toward sharper personalization as their data and content library grow.

Where our capabilities fit at each level

Some platforms are built only for the top and require a full 1:1 strategy on day one. Others cap you at the base with a ceiling that becomes obvious around the time the rule library gets too complex to maintain.

  1. At the base: Rules-based decisioning for the segments you understand and want to control directly. A/B testing to validate which experiences work before you scale them.

  2. In the middle: AI-surfaced audiences that find behavioral clusters across your data without you predefining them. Lifecycle and persona-based targeting that updates as visitor behavior changes.

  3. At the top: Contextual bandits that allocate traffic to the best-performing experience per visitor in real time. AI agents that generate content variants at the volume 1:1 actually require. Agentic personalization that automates routine decisions so the program scales without multiplying manual work.


What separates real platforms from polished demos

A quick scan of what to look for. The deep dive on each follows below.

  • Data integration: Native reads from Snowflake, BigQuery, Databricks, and Redshift in real time. No exports, no nightly batches.
  • Full journey coverage: Rules, AI-surfaced audiences, contextual bandits, and A/B testing all running on the same platform, on the same data.
  • AI that's part of the workflow: Surfacing audiences, generating variants in your brand voice, summarizing results in plain language. Not bolted on after the fact.
  • Proof, not proxies: Automatic holdbacks, a stats engine that holds up to scrutiny, and warehouse-native analytics tying personalization to revenue.
  • Per-visitor decisioning: Traffic allocated to the best-performing variation per visitor in real time, without a rule for every case.

The question to ask every vendor:

Can this platform meet my program's needs today and grow with us as we mature?

If the answer is "we focus on the top of the pyramid" or "we focus on the base," the platform is going to fight you within a few months.

Here are the capabilities you should evaluate:

1. Holdbacks and a stats engine that prove lift

What to look for: Automated holdbacks run against every personalized experience, so a control group is always reserved. Global holdouts measure cumulative program impact across all your personalization efforts. Both run without anyone having to set them up test by test. A stats engine with sequential testing and CUPED sits underneath, so when finance asks how the lift was calculated, you have an answer that holds up.
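For readers who have not met CUPED before, the adjustment can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not any vendor's implementation: the function name `cuped_adjust` and the simulated spend figures are assumptions made for the example. The idea is to subtract the part of each visitor's metric that a pre-experiment covariate already predicts, which reduces variance without moving the mean.

```python
import random
import statistics

def cuped_adjust(metric, covariate):
    """Return CUPED-adjusted metric values.

    theta    = cov(metric, covariate) / var(covariate)
    adjusted = metric - theta * (covariate - mean(covariate))
    """
    mean_x = statistics.fmean(covariate)
    mean_y = statistics.fmean(metric)
    # Sample covariance between the metric and the pre-experiment covariate.
    cov_xy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(covariate, metric)
    ) / (len(metric) - 1)
    theta = cov_xy / statistics.variance(covariate)
    return [y - theta * (x - mean_x) for x, y in zip(covariate, metric)]

# Synthetic data (hypothetical): spend before the experiment predicts
# most of the spend observed during it.
random.seed(7)
pre_spend = [random.gauss(50, 15) for _ in range(2000)]
post_spend = [0.8 * x + random.gauss(10, 5) for x in pre_spend]

adjusted = cuped_adjust(post_spend, pre_spend)
raw_variance = statistics.variance(post_spend)
adjusted_variance = statistics.variance(adjusted)
```

In a real experiment, theta would typically be estimated on data pooled across the personalized and holdback groups, and the adjusted values would then be compared between groups. Lower variance means the same lift can be detected with less traffic.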

Ask the vendor: “Show me where the lift number comes from.” Have them draw the line from a personalized experience, to the holdback that controlled for it, to the warehouse table where the business metric lives.
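To make "where the lift number comes from" concrete, here is a minimal Python sketch of a holdback and the lift calculated against it. The function names, the 5% holdback size, and the visitor counts are all hypothetical. The interval is a plain fixed-horizon normal approximation, which is simpler than the sequential methods a production stats engine would use.

```python
import hashlib
import math

def in_holdback(visitor_id: str, holdback_pct: int = 5) -> bool:
    """Deterministically reserve roughly holdback_pct% of visitors as control."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < holdback_pct

def holdback_lift(exposed_conv, exposed_n, holdback_conv, holdback_n, z=1.96):
    """Compare conversion rates between exposed visitors and the holdback."""
    p_exposed = exposed_conv / exposed_n
    p_holdback = holdback_conv / holdback_n
    diff = p_exposed - p_holdback
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(
        p_exposed * (1 - p_exposed) / exposed_n
        + p_holdback * (1 - p_holdback) / holdback_n
    )
    return {
        "exposed_rate": p_exposed,
        "holdback_rate": p_holdback,
        "absolute_lift": diff,
        "relative_lift": diff / p_holdback,
        "ci_low": diff - z * se,
        "ci_high": diff + z * se,
    }

# Hypothetical counts: 36,000 personalized visitors, 4,000 held back.
report = holdback_lift(2160, 36000, 200, 4000)
```

With these made-up counts the exposed group converts at 6% against 5% in the holdback, a 20% relative lift, and the approximate 95% interval on the absolute difference stays above zero. Hashing the visitor ID keeps each visitor in the same group across sessions without storing an assignment table.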

Example: Brooks Running was losing revenue to sizing returns. After personalizing fit recommendations against their return data, the targeted segment saw an 80% reduction.

2. Warehouse-native analytics

What to look for: Native integration with Snowflake, BigQuery, Databricks, and Redshift. The platform reads from the warehouse directly, in real time, without exports or nightly batches. Any business metric in the warehouse can be tied to any personalization, including metrics the platform itself never recorded (subscription renewals, post-purchase retention, lifetime value).

Optimizely Analytics

Image source: Optimizely

Ask the vendor: “How do you connect with our current stack?” If the answer involves moving data, the platform is rebuilding the silos you bought it to escape.
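As a rough illustration of tying a personalization to a metric the platform never recorded, the sketch below joins assignment data to a renewals table. It uses an in-memory SQLite database as a stand-in for a warehouse. The table names `assignments` and `renewals`, their columns, and the rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (visitor_id TEXT PRIMARY KEY, arm TEXT);
CREATE TABLE renewals (visitor_id TEXT, renewed INTEGER);
""")

# Which experience each visitor was assigned to (hypothetical rows).
conn.executemany(
    "INSERT INTO assignments VALUES (?, ?)",
    [
        ("v1", "personalized"), ("v2", "personalized"),
        ("v3", "personalized"), ("v4", "personalized"),
        ("v5", "holdback"), ("v6", "holdback"),
        ("v7", "holdback"), ("v8", "holdback"),
    ],
)

# A business metric owned by another system: subscription renewals.
conn.executemany(
    "INSERT INTO renewals VALUES (?, ?)",
    [("v1", 1), ("v2", 1), ("v3", 0), ("v5", 0), ("v6", 1), ("v7", 0)],
)

# Assumption: a visitor with no renewal row counts as not renewed.
rows = conn.execute("""
SELECT a.arm,
       COUNT(*) AS visitors,
       AVG(COALESCE(r.renewed, 0)) AS renewal_rate
FROM assignments AS a
LEFT JOIN renewals AS r ON r.visitor_id = a.visitor_id
GROUP BY a.arm
ORDER BY a.arm
""").fetchall()

for arm, visitors, renewal_rate in rows:
    print(f"{arm}: {visitors} visitors, renewal rate {renewal_rate:.2f}")
```

The point of the sketch is that the join runs where the renewal data already lives. Nothing is exported or copied into a second system before the comparison is made.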

Example: Australian Red Cross's personalization was ineffective while its data sat in silos. By tying personalization to donor history pulled directly from their warehouse, they lifted average order value by 37%.

3. Contextual bandits and AI decisioning

What to look for: A decisioning engine that allocates traffic to the best-performing variation per visitor in real time. Contextual bandits that adapt based on visitor signals (location, device, behavior, history) without you writing rules for every combination. Rules-based decisioning available alongside, for the segments you understand and want to control directly.

Contextual bandits in action

Image source: Optimizely

Ask the vendor: “Personalize for a visitor I haven't pre-segmented.” Give the vendor a hypothetical they didn't prepare for. If the answer requires you to define the segment first, you're buying an inefficient rules engine with a recommendation widget on top.
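For a sense of how per-visitor allocation can work without predefined segments, here is a deliberately simple sketch: Thompson sampling with one Beta posterior per context and variant pair. Production engines typically use richer models over many signals. The class name, variant names, and conversion rates below are hypothetical.

```python
import random
from collections import defaultdict

class TabularContextualBandit:
    """Thompson sampling with a Beta posterior per (context, variant)."""

    def __init__(self, variants, seed=0):
        self.variants = list(variants)
        self.rng = random.Random(seed)
        # [alpha, beta] starts at a uniform prior for any unseen context.
        self.posteriors = defaultdict(lambda: [1, 1])

    def choose(self, context):
        # Sample a plausible conversion rate per variant, pick the highest.
        samples = {
            v: self.rng.betavariate(*self.posteriors[(context, v)])
            for v in self.variants
        }
        return max(samples, key=samples.get)

    def update(self, context, variant, converted):
        # Index 0 counts conversions, index 1 counts non-conversions.
        self.posteriors[(context, variant)][0 if converted else 1] += 1

# Hypothetical true conversion rates, unknown to the bandit.
TRUE_RATES = {
    ("mobile", "hero_a"): 0.10, ("mobile", "hero_b"): 0.30,
    ("desktop", "hero_a"): 0.30, ("desktop", "hero_b"): 0.10,
}

def simulate(rounds=4000, seed=42):
    env = random.Random(seed)
    bandit = TabularContextualBandit(["hero_a", "hero_b"], seed=seed + 1)
    late_choices = defaultdict(list)
    for step in range(rounds):
        context = env.choice(["mobile", "desktop"])
        variant = bandit.choose(context)
        converted = env.random() < TRUE_RATES[(context, variant)]
        bandit.update(context, variant, converted)
        if step >= rounds - 1000:  # record only the final 1,000 decisions
            late_choices[context].append(variant)
    return bandit, late_choices

bandit, late_choices = simulate()
```

Nobody wrote a rule saying "mobile visitors get hero_b". The allocation shifts as conversions arrive. A context the bandit has never seen, such as `bandit.choose("tablet")`, still gets a decision, because an unseen context starts from a uniform prior and learns from there.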

Example: Calendly needed to personalize at scale across 20 million users without the overhead of writing a rule for every segment. After implementing AI-driven decisioning, every conversion campaign resulted in significant improvements in conversion rate.

4. AI audience creation

What to look for: AI that surfaces behavioral clusters across your data and proposes them as audiences, without you predefining them. The audiences should be testable, editable, and tied to the same decisioning engine you use for rules-based segments.

Ask the vendor: “Show me an audience your platform surfaced that the customer wouldn't have built themselves.” If every example is a demographic segment that a junior marketer could have written, the AI isn't finding anything.

Example: News UK needed to convert more digital readers into paying subscribers. Through personalized checkout and paywall experiences, they drove a 39% lift in subscriptions.

5. Visual editor and AI content variants

What to look for: A visual editor that lets a non-technical user build personalized experiences without code, including variants for hero modules, landing pages, recommendations, and form flows. AI agents that generate content variants from a conversational prompt, in your brand voice, with code that ships clean.

Optimizely is the only platform with a Variation Development Agent that builds personalized content from scratch, using conversational prompts. The output is dev-ready code that doesn't slow your site down.

Ask the vendor: “How would a marketer launch a personalized experience, start to finish?” If the answer involves pulling in your developers, expect to add 1-2 weeks to the roadmap for every personalization campaign launch.

Example: Zoopla's product managers and designers were dependent on data analysts for every personalization test. After moving to a self-serve model, they run experiments independently without needing support from analysts.

6. Edge delivery and speed

What to look for: Personalization delivered at the edge, before the page reaches the browser, with a code snippet optimized for performance. No flicker to distract the customer or take away from the experience. Honest performance benchmarks the vendor will share, not just claim.

Ask the vendor: “What is the page load delta with personalization on versus off?” If they're unsure or unwilling to share it, expect revenue-impacting flicker issues down the line.
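If you want to check a vendor's answer yourself, the delta is easy to compute from page-load samples collected with personalization on and off. The sketch below compares medians. The function name and the timings are hypothetical.

```python
import statistics

def load_delta_ms(with_personalization, without_personalization):
    """Median page-load difference in milliseconds (positive means slower)."""
    return statistics.median(with_personalization) - statistics.median(
        without_personalization
    )

# Hypothetical load times in milliseconds from the same page and network.
on_ms = [830, 850, 815, 845, 835]
off_ms = [800, 820, 790, 810, 805]

delta = load_delta_ms(on_ms, off_ms)
print(f"Personalization adds {delta} ms at the median")
```

Medians are used here because a handful of slow loads can distort an average. A real benchmark would use far more samples and would also look at the slowest percentiles.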

Walk away from any platform that...

  • Treats engagement metrics as the only proof
  • Skips holdbacks because "you'll always see lift anyway"
  • Ships personalization on the client side without removing the flicker
  • Calls "self-service" something that needs SQL
  • Shows AI features that don't save time
  • Caps how far you can grow
  • Requires a developer for every variation

Real AI vs. AI-washing

Every vendor will pitch you AI. Most of it is a recommendation widget with a new label. Some of it is a chatbot inside the dashboard. The rest is a feature that exists but doesn't save anyone time. The shortcut for separating real from theater: ask whether the AI is part of the workflow or stapled to the side of it.

  1. Audience generation: AI surfaces behavioral clusters across your data and proposes them as testable audiences. The work that would have taken your data scientist a week happens in minutes.

  2. Content variants: AI generates personalized content from conversational prompts, in your brand voice, at a volume your team couldn't produce by hand. The variants ship clean and the team reviews and approves rather than briefing from scratch.

  3. Decisioning: Contextual bandits allocate traffic to the best-performing experience per visitor, in real time, without anyone pulling the report. The system learns continuously and adapts as visitor behavior shifts.

  4. Result interpretation: AI translates experiment outcomes into plain language a stakeholder can read without a dashboard. Findings, statistical confidence, recommended next steps.

  5. Workflow: Workflow agents optimize for learning, which compounds the output of the entire personalization program.

Workflow agents in action

Image source: Optimizely

According to our latest Agentic experimentation benchmark report, programs that adopt workflow agents are achieving 50% more output.

Read the full report based on lessons from 47,000+ AI interactions with actual users.

What's still marketing fluff:

  • AI that claims to run experiments end-to-end without human review
  • "Smart" automation you can't validate or override
  • AI features requiring extensive new tracking to work
  • Claims that AI replaces strategy or judgment

Questions to ask any vendor pitching AI:

  • Show me exactly where AI saves time in my current workflow. Specifics, not demos.
  • What decisions does it help us make faster, with examples?
  • Can I validate, override, and audit every AI recommendation?
  • Does it work with the data I already have, or do I need new tracking?
  • Is the AI informed by our brand and data, or is it a generic model with our logo?

If the answers are vague, the AI is vague.


Beyond the feature list

Look at:

  • Ease of adoption: Ask how long it takes a new user to ship their first personalized experience. If it's more than a week, your program will start slow and stay slow.
  • Stack fit: Ask for a list of customers running the platform with your specific stack. A thin list means a theoretical integration.
  • Compliance: Ask for GDPR and CCPA documentation in the first call. If the vendor sends a marketing page, the work hasn't been done.
  • Total cost: Year one usually runs two to three times the license fee. Ask for total cost of ownership benchmarks from comparable customers, not the list price.
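The total cost rule of thumb above is simple arithmetic, sketched here in Python. The license fee is a hypothetical figure, and the multiples are the two-to-three-times range quoted in the list.

```python
def year_one_cost_range(license_fee, low_multiple=2.0, high_multiple=3.0):
    """Estimate the year-one total cost range from the license fee alone."""
    return license_fee * low_multiple, license_fee * high_multiple

# Hypothetical license fee of 100,000 (any currency).
low, high = year_one_cost_range(100_000)
print(f"Budget between {low:,.0f} and {high:,.0f} for year one")
```

Treat the result as a starting point for the conversation. Benchmarks from comparable customers should replace the generic multiples as soon as the vendor provides them.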

What the first few months actually look like

Months 1–3: Foundation

Integrate the platform with your warehouse, CDP, CRM, and CMS. Get tracking in place for the business metrics that matter, including revenue, retention, and LTV, not just clicks. Train the core team and launch the first few instances of personalization against broad segments, tied to revenue from day one. Establish where you realistically are in the pyramid before trying to climb it.

Months 4–6: Scale

Open the platform up to product and marketing teams beyond the core. Add more sophisticated decisioning, with rules-based segments built around lifecycle, behavior, and persona. Run your first holdback-controlled experiments and start surfacing AI audiences to test.

Months 7–12: Per-visitor

Layer in contextual bandits for high-traffic surfaces and use AI to generate content variants at volume. Tie personalization decisions to multi-month outcomes (subscription renewal, retention, LTV, not just session conversion).

At the end, focus on...

Proving causation, not just correlation

If you're earlier in your evaluation, two things worth reading:

  1. The personalization playbook: See how to deliver relevant experiences, measurable impact, and real business results.
  2. How to build a personalization strategy: The strategic framework before the platform decision.

If you want to see how this works against the program you're running today, book a demo.

We'll walk through your workflow, not anything generic.