Bayesian Statistics for A/B test analysis

Traditionally, frequentist methods were used for A/B test analysis. Nowadays, they are increasingly being replaced by so-called Bayesian statistics because of its various advantages.

In this article we won't go into detail about frequentist methods, since plenty of resources already cover them. Instead, we will highlight why we chose Bayesian statistics and why it is especially interesting for autonomous website optimization.

What is the Bayesian approach?

In this approach, we create two models, Ma and Mb (one for each variant), and then compare them. Each model is built from the experimental data and describes the plausible values of the underlying rate for its variant, A or B. We draw random samples of possible rates from both models and calculate the difference between these rates. The goal is to estimate the distribution of the difference between the two processes.

Unlike the frequentist approach, the Bayesian approach compares the two models directly.

Now we need to create one model for A and one for B.

Clicks can be modeled with a binomial distribution, whose parameters are the number of trials and the success rate. In digital experiments, the number of trials corresponds to the number of visitors, and the success rate corresponds to the click or transaction rate. It is important to note that the rates we are interested in are only estimates based on a limited number of visitors. To model this limited precision, we use beta distributions (the beta distribution is the conjugate prior of the binomial distribution).

A beta distribution describes how plausible each possible success rate is, given that the rate was measured over a limited number of trials.
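To make the idea of limited precision concrete, here is a minimal sketch using the closed-form mean and standard deviation of a beta distribution. The helper name beta_mean_std and the visitor counts are illustrative assumptions, not part of any specific tool. It keeps the observed rate at 10% and only varies the number of visitors.

```python
import math

def beta_mean_std(successes, failures, prior_alpha=1.0, prior_beta=1.0):
    """Mean and standard deviation of Beta(prior_alpha + successes, prior_beta + failures).

    Hypothetical helper for illustration only.
    """
    a = prior_alpha + successes
    b = prior_beta + failures
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)

# Same observed 10% rate, measured over increasingly many visitors (made-up counts).
for visitors in (100, 1_000, 10_000):
    successes = visitors // 10
    mean, std = beta_mean_std(successes, visitors - successes)
    print(f"{visitors:>6} visitors: mean={mean:.4f}, std={std:.4f}")
```

The standard deviation shrinks as the number of visitors grows, which is exactly the limited precision the beta distribution is meant to capture.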

For example:

1,000 visitors to A with 100 successes

1,000 visitors to B with 130 successes

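We create the model Ma = beta(1 + success_a, 1 + failure_a), where success_a = 100 and failure_a = visitors_a - success_a = 900. The model Mb is built in the same way: Mb = beta(1 + success_b, 1 + failure_b), where success_b = 130 and failure_b = visitors_b - success_b = 870.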
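The comparison of the two models can be sketched with a small Monte Carlo simulation. This is a rough illustration using only Python's standard library, not a reference implementation: the function name, the number of samples, the fixed seed, and the 95% interval are all assumptions made for the example.

```python
import random

def sample_rate_differences(success_a, visitors_a, success_b, visitors_b,
                            n_samples=20_000, seed=0):
    """Draw rate samples from Ma and Mb and return the differences (B minus A).

    Hypothetical helper; the +1 terms are the non-informative prior.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    a_alpha, a_beta = 1 + success_a, 1 + (visitors_a - success_a)
    b_alpha, b_beta = 1 + success_b, 1 + (visitors_b - success_b)
    return [
        rng.betavariate(b_alpha, b_beta) - rng.betavariate(a_alpha, a_beta)
        for _ in range(n_samples)
    ]

# Example data from the text: 100/1,000 successes for A, 130/1,000 for B.
diffs = sample_rate_differences(100, 1_000, 130, 1_000)

# Share of samples in which B's rate is higher than A's.
prob_b_better = sum(d > 0 for d in diffs) / len(diffs)

# Mean difference and a central 95% interval read off the sorted samples.
mean_diff = sum(diffs) / len(diffs)
sorted_diffs = sorted(diffs)
lower = sorted_diffs[int(0.025 * len(sorted_diffs))]
upper = sorted_diffs[int(0.975 * len(sorted_diffs))]

print(f"P(B > A) ~ {prob_b_better:.3f}")
print(f"mean difference ~ {mean_diff:.4f}, 95% interval ~ [{lower:.4f}, {upper:.4f}]")
```

The list diffs is an empirical estimate of the distribution of the difference between the two rates, so any summary of interest (probability that B is better, expected gain, an interval) can be read directly from it.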

You will have noticed the +1 added to the success and failure parameters. It comes from the “prior” in Bayesian analysis. A prior is what you know before the experiment, for example something derived from another (earlier) experiment. However, it is well documented in digital experiments that click-through rates are not constant and can change depending on the time of day or the season, so results from an earlier experiment cannot reliably be used as a prior in practice. The +1 setting corresponds to beta(1, 1), a non-informative prior that treats every rate between 0 and 1 as equally plausible. It is the appropriate choice when you have no usable previous experimental data to fall back on.
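To see how little the +1 prior influences the result, the short sketch below compares the posterior mean under beta(1, 1) with the raw observed rate for variant A. For contrast it also uses a purely hypothetical informative prior, equivalent to 1,000 earlier visitors with a 20% rate. The function name and the informative prior are assumptions made for illustration.

```python
def posterior_mean(successes, visitors, prior_alpha, prior_beta):
    """Mean of Beta(prior_alpha + successes, prior_beta + failures)."""
    return (prior_alpha + successes) / (prior_alpha + prior_beta + visitors)

# Variant A from the example: 100 successes out of 1,000 visitors.
raw_rate = 100 / 1_000

# Non-informative prior beta(1, 1): the +1 used in the text.
flat = posterior_mean(100, 1_000, prior_alpha=1, prior_beta=1)

# Hypothetical informative prior: 200 successes and 800 failures from an
# earlier experiment (made-up numbers, not from the article).
informed = posterior_mean(100, 1_000, prior_alpha=200, prior_beta=800)

print(f"raw rate:            {raw_rate:.4f}")
print(f"beta(1, 1) prior:    {flat:.4f}")
print(f"informative prior:   {informed:.4f}")
```

With the flat prior the estimate stays almost identical to the observed rate, while an outdated informative prior would pull it noticeably away from the data, which is the risk described above.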

Sources:

https://towardsdatascience.com/why-you-should-switch-to-bayesian-a-b-testing-364557e0af1a

https://docs.growthbook.io/assets/files/GrowthBookStatsEngine-d239dd518fdfa7198be46489bb65b8e3.pdf