I’m interested in using Stan to estimate probabilities for AB tests with multiple variants and I’ve been following this pymc tutorial.

I’ve managed to translate between pymc and Stan okay, but I’m reaching a problem in the Hierarchical Modeling section, where the author defines hyperpriors for the beta distribution using guidance from BDA. I can’t figure out how I might go about doing this within Stan. I’ve been looking around, but can’t find any examples for anything other than pymc.

Can anybody give me any pointers on how I might go about doing this?

I’d like to ask you to do a little more legwork. It looks like a custom density calculation so you can just code it directly in Stan by incrementing the density with target +=. Can you be more specific about what is unclear?

When I try to fit this model I get warnings about divergent transitions, even with a high value for adapt_delta as the Stan manual suggests, and the posterior for b, but not a, tends to have an extremely long tail. How worried should I be about this?

I wouldn’t worry about it but I wouldn’t use the model. Custom densities can be funny so you might want to plot this thing you’re using. In fact, I would try this model without data (so drop the part about the binomial and when yoiu sample you’ll be simulating from the prior). That way you can see what prior your density creates for p1/p2.

One point about coding: a and b only exist on positive reals (except when they underflow to zero) in which case the log density will be negative infinity and get rejected. That’s a long way of saying that since you’ve correctly declared constraints on a/b you don’t need the if-else statement, just increment the target density as you are doing.

I’m sure someone more familiar with this density could just give you an answer but this is the process (well, one process) to get there either way.

I answered this elsewhere but don’t know where. I cover it in my repeated binary trials case study (web site >> users >> documentation >> case studies). I also coded it in the manual in the “reparameterizations” section. The density you want is the Pareto—it’s built in. Andrew just worked it out from the Jacobian in BDA chapter 5, but he doesn’t really recommend that prior. The case study shows you how to convert that to an intercept-only logistic regression, which is then much easier to generalize and is amenable to more general (multivariate) priors.