Integrating Builder.io with another A/B testing system

Builder.io is easy to customize to sync with another A/B testing system - for instance Optimizely, LaunchDarkly, Dynamic Yield, or a custom one. In this post we’ll walk you through a few options for setting this up, depending on your needs.

Option 1 - custom field & query

One of the simplest options is to create a custom field to choose or enter an A/B test group set up in your test provider (Optimizely, LaunchDarkly, etc.). This can either be a value you enter manually, or you can create a custom field plugin that fetches the list of possible options and presents the choices.

You can then use queries to filter content matching the correct test groups.
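For example, with the Builder JavaScript SDK you can pass a query that filters on that custom field. This is just a minimal sketch: the data.testGroup field name and the assignedGroup value are assumptions for illustration.

// Fetch only content whose custom "testGroup" field matches the group your
// A/B testing provider assigned to this user.
import { builder } from '@builder.io/sdk';

builder.init('YOUR_API_KEY');

const assignedGroup = 'abc123'; // e.g. returned by your A/B testing SDK

const content = await builder
  .get('page', {
    url: location.pathname,
    query: {
      'data.testGroup': assignedGroup, // filter on the custom field
    },
  })
  .promise();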

Option 2 - targeting

This option is similar to the one above, but can be simpler for some use cases, especially when matching content to test groups is purely optional.

For instance, you can make a custom targeting field called “testGroups”. You can then use targeting to assign a piece of content to be chosen when a given user is assigned to a specific test group, e.g. in Builder.io by choosing “testGroups” “contains” [choose test group ID]. You can also use a custom plugin to make choosing the group in the targeting UI easy.

Then just send test groups as user attributes, e.g.

builder.setUserAttributes({ 
  testGroups: ['abc123'].sort().join(',') // Sorting ensures best cache efficiency
});
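If you fetch content via the SDK, you can also pass the attributes directly on the request rather than setting them globally. A sketch, where abClient.getAssignedGroups is a hypothetical call on your A/B testing SDK and announcement-bar is just an example model name:

// Hypothetical: ask your A/B testing SDK for this user's group assignments,
// then pass them as user attributes when fetching Builder content.
const groups = await abClient.getAssignedGroups(userId); // e.g. ['abc123', 'def456']

const content = await builder
  .get('announcement-bar', {
    userAttributes: {
      testGroups: groups.sort().join(','), // sorted for cache efficiency
    },
  })
  .promise();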

Option 3 - webhooks

Use this option if you want the tests to be created directly in Builder as A/B test variations in our UI, as opposed to separate content entries that have to be manually connected to test groups in your A/B testing system. This can be the most seamless for editors in Builder.

To use this option, you’ll want to use webhooks to listen to content changes in Builder. Anytime content in Builder has A/B tests, the JSON we send will include a variations object:

{
  /* Main content ID */
  "id": "abc123",
  /* Main content data */
  "data": { /* ... */ },
  /* Map of variation ID to variation content  */
  "variations": {
    /* Test group ID */
    "xyz789": {
      /* Test group ID */
      "id": "xyz789",
      /* Test variation name */
      "name": "My test name",
      /* Test variation traffic ratio (0 to 1) */
      "testRatio": 0.5,
      /* Variation data */
      "data": { /* ... */  }
    }
  }
}

Reading this, you can see that the above content has a 50/50 test of the default content vs. one variation. You can then sync this to your A/B testing system, writing the variation names and IDs.
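For instance, a webhook handler might look something like the sketch below. The route, the payload arriving as the request body in the shape shown above, and the syncExperiment helper for your provider’s API are all assumptions:

// Hypothetical Express handler for a Builder webhook: mirror Builder's A/B
// test variations into your A/B testing system.
const express = require('express');
const app = express();
app.use(express.json()); // parse webhook JSON bodies

app.post('/builder-webhook', (req, res) => {
  const content = req.body; // JSON in the shape shown above
  if (content.variations && Object.keys(content.variations).length > 0) {
    const variants = Object.values(content.variations).map((variation) => ({
      id: variation.id,
      name: variation.name,
      trafficRatio: variation.testRatio,
    }));
    // The default variation's ID is the main content ID (see the note below)
    variants.push({ id: content.id, name: 'Default' });
    syncExperiment(content.id, variants); // assumed helper for your provider
  }
  res.sendStatus(200);
});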

Then, when users load your site, you can ask your A/B testing system which groups the user should be in. You can instruct Builder on which test variation to load by setting a cookie of the format

setCookie(`builder.tests.${contentId}`, variationId)

Builder will read this cookie and use the provided test variation.

Note that for the default variation, the variation ID is the content ID - so to load the default variation, set

setCookie(`builder.tests.${contentId}`, contentId)
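Putting that together on page load: setCookie is not a browser built-in, so here is a minimal version of it, wired to a hypothetical abClient.getVariation call on your A/B testing SDK:

// Minimal cookie helper (the setCookie used above is not a browser built-in)
function setCookie(name, value, days = 30) {
  const expires = new Date(Date.now() + days * 24 * 60 * 60 * 1000).toUTCString();
  document.cookie = `${name}=${value}; expires=${expires}; path=/`;
}

const contentId = 'abc123';
// Hypothetical: your A/B testing SDK returns the variation ID this user is
// assigned to, or we fall back to the content ID for the default variation.
const variationId = abClient.getVariation(contentId) || contentId;
setCookie(`builder.tests.${contentId}`, variationId);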

One downside to each of these solutions is that the client doesn’t know which experiments Builder cares about, so it has to pre-compute the user’s treatment for every experiment – even experiments the user never sees. That will change the denominator of the experiment, corrupting the results.

In order to get statistically significant results in the experiment, the user has to be bucketed for experiment X if and only if Builder is asked to target content based on X.

If the third-party experimentation system allows you to compute the user’s treatment for X without actually recording them as bucketed, you could do the following (sketched after the list):

  1. pre-compute the treatment for every experiment without recording bucketing
  2. in Builder, add some JS to record the bucketing
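In code, that might look something like this, where listExperiments, evaluateWithoutTracking, and trackExposure are hypothetical methods standing in for whatever your experimentation SDK actually exposes:

// Step 1: pre-compute every treatment without recording bucketing, and pass
// the results to Builder as targeting attributes.
const treatments = {};
for (const experiment of abClient.listExperiments()) {
  treatments[`experiment:${experiment.key}`] =
    abClient.evaluateWithoutTracking(experiment.key, userId);
}
builder.setUserAttributes(treatments);

// Step 2: inside the Builder content targeted on experiment X, run a small
// custom code snippet so the exposure is recorded only when it can actually
// affect what the user sees.
abClient.trackExposure('buy_it_again', userId);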

If the third-party system doesn’t support separating those concerns, the only way to do it would be to have Builder do the lookup asynchronously before rendering the content.

Hey @jamesarosen!

The approaches outlined above are ways to fetch and organize Builder content based on user attributes, and they rely on knowing which “bucket” the user is in; they do not involve using Builder’s logic to bucket a user. Whatever your A/B test system provides to bucket users, you would take that user attribute and pass it to Builder to retrieve the proper content for the specific variation.

I fully agree that the user has to be bucketed for experiment X if and only if Builder is asked to target content based on X, but I am a little confused as to why that would be an issue with the three options above. Builder content should only be requested for users that are in the specific bucket. I would assume the A/B test system has logic to determine when to bucket people (say, based on user attributes or URL) and that it only buckets users when they would actually see the item. Then you just pass those attributes in your call to request Builder content.

I am probably missing something here, so please let me know if I misunderstood!

Let’s imagine we’re running the buy_it_again experiment: for users who have already ordered from us, does showing a “buy it again” message on their cart page increase sales?

That is, we want to compare

revenue by returning users shown buy-it-again
---------------------------------------------
# of users who _could_ have seen buy-it-again

with

revenue by returning users not shown buy-it-again
-------------------------------------------------
  # of users who _could_ have seen buy-it-again

With an integrated segmentation-and-content-and-event-tracking system, the client can just ask the system “which content should we show on the check-out page?” For non-returning users, the system will return “Thanks for becoming a customer!” and do nothing else. For returning users, it will roll the dice and select either “Thanks for coming back!” or “Here’s what you bought last time: … Need another?” Only in this second case will it record the fact that it rolled the dice. Notice that the front-end doesn’t actually need to roll the dice, record whether the dice were rolled, determine whether it should roll the dice based on audience, or anything else. This is how systems like Optimizely and LaunchDarkly work.

With the content and experimentation systems disconnected, we need to be more careful about computing and recording. The UI doesn’t know whether Builder might use buy_it_again for any given content, so it has to pre-roll the dice before it fetches any content from Builder. But it can’t send the fact that it rolled the dice to the experimentation framework ahead of time: that could result in counting first-time buyers, and even users who browse but never buy, in the denominator. Instead, we need to delay recording that dice-roll until the exact moment it could affect user behavior: when Builder actually uses it to generate content.

One general-purpose solution would be for Builder to return all the targeting information it used back to the client. The client can then be responsible for digging through that and recording the dice-rolls:

{
  message: {
    data: {…},
    meta: {…},
    query: [
      { operator: "is", property: "countryCode", value: [ "CA" ] },
      { operator: "is", property: "device", value: [ "desktop" ] },
      { operator: "is", property: "experiment:buy_it_again", value: [ "false" ] },
      { operator: "is", property: "experiment:sms_activation", value: [ "false" ] },
      { operator: "is", property: "returning", value: [ "true" ] },
    ],
    targeting: { returning: true, "experiment:buy_it_again": false },
  }
}
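The client-side pass over that metadata could then be as simple as the following sketch. Both the response shape above (a proposal, not an existing Builder API) and abClient.trackExposure are hypothetical:

// Walk the proposed targeting metadata and record an exposure for every
// experiment Builder actually used to select this content.
for (const [property, value] of Object.entries(response.message.targeting)) {
  if (property.startsWith('experiment:')) {
    const experimentKey = property.slice('experiment:'.length);
    abClient.trackExposure(experimentKey, userId, value);
  }
}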

Without a general-purpose solution like the ones above, you’d have to have a programmer modify the surrounding code so it knows under exactly what circumstances Builder will ask for bucketing information. But that means that site editors need to synchronize their content changes with engineering deploys – exactly what Builder is meant to stop.