A/B testing with lognormally distributed data

Hi! I’m hoping for some advice, as I’m new to this :)
I have some data which I believe is distributed lognormally. I’m using a normal prior for the mean and a Cauchy prior for the standard deviation. I’m performing A/B testing and calculating two values: the probability that the mean of the B variant is greater than the mean of the A variant, and the lift I could expect if I chose the B variant. Unfortunately, I can’t share the data. The Stan code is as follows:

data {
  real mu_prior;
  real sigma_prior;

  int<lower=0> control_n;
  vector<lower=0>[control_n] revA;

  int<lower=0> var_n;
  vector<lower=0>[var_n] revB;
}

parameters {
  real muA;
  real<lower=0> sigmaA;
  real muB;
  real<lower=0> sigmaB;
}

model {
  // priors (the lower=0 constraint makes the Cauchy priors half-Cauchy)
  muA ~ normal(mu_prior, 2);
  sigmaA ~ cauchy(sigma_prior, 3);
  muB ~ normal(mu_prior, 2);
  sigmaB ~ cauchy(sigma_prior, 3);

  // likelihood
  revA ~ lognormal(muA, sigmaA);
  revB ~ lognormal(muB, sigmaB);
}

generated quantities {
  // difference in means is the quantity of interest
  real mu_diff;
  real post_revenue_a;
  real post_revenue_b;
  real revenue_diff;

  mu_diff = muB - muA;
  post_revenue_a = lognormal_rng(muA, sigmaA);
  post_revenue_b = lognormal_rng(muB, sigmaB);
  revenue_diff = post_revenue_b - post_revenue_a;
}

Sometimes I get a high probability that variant B has a higher mean than variant A, but a negative average for the revenue_diff term. Does this make sense?


Hey! Note that \mu_A and \mu_B are estimates for the mean of \log(\text{rev}_A) and \log(\text{rev}_B). To get an estimate for the mean of \text{rev}_A, for example, you need to calculate \exp(\mu_A+\sigma^2_A/2), or exp(muA + 0.5*square(sigmaA)) in Stan code. An estimate for the median of \text{rev}_A would be just \exp(\mu_A). So if you just look at the \mu's, you are more or less comparing the medians of the two distributions. And depending on the spread and tails of the two distributions, revenue_diff could take an unexpected sign.
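For example, a revised generated quantities block could compare the actual means directly (just a sketch; mean_a, mean_b, and mean_diff are illustrative names, not from your model):

generated quantities {
  // posterior means of the two lognormal distributions:
  // E[rev] = exp(mu + sigma^2 / 2)
  real mean_a = exp(muA + 0.5 * square(sigmaA));
  real mean_b = exp(muB + 0.5 * square(sigmaB));
  // difference in means; the proportion of draws with
  // mean_diff > 0 estimates Pr(mean_B > mean_A)
  real mean_diff = mean_b - mean_a;
}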

I’d just plot the two distributions next to each other and compare visually (if that’s feasible) to see if your results make sense. Hope this helps! :)

Cheers,
Max


Oh wow, thank you! I think this fixed it :)


In addition to what @Max_Mantei said: if the “sometimes” here refers to “for some draws from the posterior” or “for some datasets where you do not have a lot of data”, the prior on the sds could be the driver of what you are seeing. I would not advise using a heavy-tailed distribution like the half-Cauchy for the scale of a lognormal. As @Max_Mantei said, the mean is e^{\mu + \sigma^2/2}, or e^{\mu} e^{\sigma^2/2}. In other words, the heavy-tailed prior affects the mean as a multiplicative factor after squaring and exponentiating. You would need a good amount of data to overcome that prior.


Very good point, @stijn! Thanks!

Thanks for pointing that out! Do you have a suggestion for a lighter-tailed prior?

It really depends on the scale of your outcome. I typically start with normal(0, 1) (effectively half-normal, given the positivity constraint) and do some simulations in R to see what that gets me. I would start by investigating the effect of the prior on the factor e^{\sigma^2/2}:

> sds <- abs(rnorm(1e5))
> factor <- exp(sds^2/2)
> mean(factor > 1e3)
[1] 2e-04
> mean(factor > 1e2)
[1] 0.00222
> mean(factor > 1e1)
[1] 0.03173
> mean(factor > 5)
[1] 0.07164
> mean(factor > 2)
[1] 0.23969

For instance, the last line means that this prior implies a 24% chance of a multiplicative factor larger than 2.

Here is the same check for the half-Cauchy prior. You can see that extreme outcomes are much more likely:

> sds <- abs(rcauchy(1e5))
> factor <- exp(sds^2/2)
> mean(factor > 1e3)
[1] 0.16561
> mean(factor > 1e2)
[1] 0.20097
> mean(factor > 1e1)
[1] 0.27571
> mean(factor > 5)
[1] 0.32131
> mean(factor > 2)
[1] 0.44584

For a more structured way of doing this, you can look up prior predictive checks.
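For example, a minimal version of such a check for this model might look like this (a sketch; the prior values are illustrative, not taken from your data):

# simulate revenues implied by the priors alone
mu    <- rnorm(1e5, 0, 2)        # normal(mu_prior, 2), with mu_prior = 0 for illustration
sigma <- abs(rnorm(1e5, 0, 1))   # half-normal(0, 1), as suggested above
rev_sim <- rlnorm(1e5, meanlog = mu, sdlog = sigma)
quantile(rev_sim, c(0.5, 0.9, 0.99))  # do these look like plausible revenues?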
