Geometric distribution AA test failing

I am trying to run an A/A test on data whose distribution is closest to the geometric distribution.

As there is no built-in geometric distribution in Stan, I am using a custom likelihood function for it:

data {
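  int<lower=0> NA; // Number of observations in group A
  int<lower=0> NB; // Number of observations in group B
  array[NA] int<lower=1> yA; // Trials up to and including the first success, per observation in group A
  array[NB] int<lower=1> yB; // Trials up to and including the first success, per observation in group B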
}

parameters {
  real<lower=0, upper=1> pA; // Probability of success for group A
  real<lower=0, upper=1> pB; // Probability of success for group B
}

model {
  // Prior distributions for probabilities (Beta prior)
  pA ~ beta(1, 1); // Uniform prior
  pB ~ beta(1, 1); // Uniform prior

  // Likelihood for group A and B
  for (n in 1:NA) {
    target += log(      pA * (1-pA)^(yA[n]-1)      );
    }

  for (n in 1:NB) {
    target += log(      pB * (1-pA)^(yB[n]-1)      );
    }

 
}

generated quantities {
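  real mean_a;
  real mean_b;
  real upt_diff;
  int p_variant_beating_control; // 1 if group B's mean exceeds group A's in this draw
  int p_control_beating_variant; // 1 if group A's mean exceeds group B's in this draw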
  
  mean_a = 1/pA;
  mean_b = 1/pB;

  upt_diff = mean_b - mean_a ;
  p_variant_beating_control = upt_diff > 0 ;
  p_control_beating_variant=  upt_diff < 0 ;
 
}

Unfortunately, the result gives a lot of false positives.

Can anyone please suggest how to improve this?

You have pA in both likelihoods (the group B loop uses (1-pA) where it should presumably use (1-pB)); I take it that this shouldn’t be the case. Other than that, I guess the number of observations in each group could skew the results by scaling them wrongly: you are adding both to the same target, and k=4 for one distribution may have a different impact than for the other. The prior choices are possibly not the best either, but I guess that would be a comparatively minor issue.

While the geometric distribution PMF is simple and efficient to compute, if you suspect that there is a bug in your custom distribution, it could be easier to use the negative binomial distribution. Since the current Stan implementations don’t seem to be parameterized by the success (or failure, depending on how you define them) probability p, you will need to reparameterize the distribution, but that should be straightforward. You can then set the dispersion parameter r=1, which reduces the negative binomial to a geometric distribution.
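
As a hedged sketch of that reparameterization: Stan's neg_binomial(alpha, beta) has pmf C(y + alpha - 1, alpha - 1) * (beta / (beta + 1))^alpha * (1 / (beta + 1))^y over y = 0, 1, 2, ..., so with alpha = 1 and beta = p / (1 - p) it reduces to p * (1 - p)^y, the geometric pmf for the number of failures before the first success. Assuming yA is an integer array of length NA and each yA[n] counts trials up to and including the first success (so yA[n] >= 1), the group A likelihood could be written roughly as follows; group B would be analogous with pB and yB.

```stan
// Sketch only: assumes yA is declared as an integer array of length NA
// with every yA[n] >= 1 (trials up to and including the first success).
for (n in 1:NA) {
  // neg_binomial counts failures before the success, hence the shift by 1;
  // alpha = 1 and beta = pA / (1 - pA) give success probability pA.
  target += neg_binomial_lpmf(yA[n] - 1 | 1, pA / (1 - pA));
}
```

This should give the same log density as the hand-written geometric likelihood, so comparing the two fits is one way to check the custom code.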

Hi, @Siddhart_Somani and welcome to the Stan forums.

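We should add a geometric distribution to Stan. Its pmf is a special case of the multinomial (up to a constant factor that does not depend on pA), where it’d look as follows. Note that Stan's multinomial takes only the probability simplex; the total count is implied by the outcome array.

{1, yA[n] - 1} ~ multinomial([pA, 1 - pA]');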
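
To spell out why the multinomial works here, writing y for yA[n] and p for pA (notation introduced only for this note), one success and y - 1 failures in y trials have multinomial probability:

```latex
\mathrm{Multinomial}\bigl((1,\; y-1) \mid (p,\; 1-p)\bigr)
  = \frac{y!}{1!\,(y-1)!}\; p^{1}\,(1-p)^{y-1}
  = y \cdot p\,(1-p)^{y-1}
```

The leading factor y counts the possible positions of the single success, whereas the geometric pmf fixes it in the last position. Since y is data and does not depend on p, the two differ only by a constant and should give the same posterior for p.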

If you want to code it directly as you did,

target += log(      pA * (1-pA)^(yA[n]-1)      );

I’d recommend using the following form for arithmetic stability.

target += log(pA) + (yA[n] - 1) * log1m(pA);

As you had it written, it will underflow to zero for large yA[n] values and then the log will be negative infinity and Stan won’t be able to make progress. The log1m(pA) will also stabilize that computation if pA is near zero.
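
Putting the suggestions in this thread together, a corrected model might look roughly like the sketch below. It assumes that yA and yB are arrays holding one observation per unit (trials up to and including the first success, so each value is at least 1), that the second likelihood should use pB, and that a Stan version supporting the array[N] declaration syntax is available. The name variant_beats_control is illustrative, not from the original post.

```stan
data {
  int<lower=0> NA;            // number of observations in group A
  int<lower=0> NB;            // number of observations in group B
  array[NA] int<lower=1> yA;  // assumed: trials up to and including first success
  array[NB] int<lower=1> yB;
}
parameters {
  real<lower=0, upper=1> pA;
  real<lower=0, upper=1> pB;
}
model {
  pA ~ beta(1, 1);
  pB ~ beta(1, 1);
  // Sum over n of log(p) + (y[n] - 1) * log1m(p), collected into one term per group.
  target += NA * log(pA) + (sum(yA) - NA) * log1m(pA);
  target += NB * log(pB) + (sum(yB) - NB) * log1m(pB);
}
generated quantities {
  real mean_a = 1 / pA;
  real mean_b = 1 / pB;
  real upt_diff = mean_b - mean_a;
  // 1 when B's mean exceeds A's in this draw; its posterior mean
  // estimates Pr(mean_b > mean_a | data).
  int<lower=0, upper=1> variant_beats_control = upt_diff > 0;
}
```

Collecting the sum this way avoids the loop but is otherwise the same stable log-scale form recommended above.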
