Hierarchical model - Estimating conversion rates for hotels

Hi everyone,

I have a small dataset from Google AdWords with the number of bookings and clicks per hotel. The idea is to estimate the conversion rates by a hierarchical model. Specifically, some hotels are located within the same region, and, thus, one would expect to share some similarity as to conversion rates.

I have written a Stan model to estimate the conversion rates, however, I’m still getting a few warnings. Specifically, E-BFMI is very low even though the Rhat are all close to 1. When I look at the posteriors and effective number of samples, I witness a very good mixing and can’t identify any pathological behavior.

Additionally, when I remove the data from hotels with zero bookings it runs flawlessly. Needless to say, adapting the data to get the expected outcome does not seem to be the right approach.

My questions are:

  1. How can I re-parametrize my model? A lot of the hotels have no bookings and thus have a null conversion rate. Sampling in this regions is rather difficult. Do you have any suggestions on how to sample in regions?

  2. Any literature you can recommend on “reparameterization”? Somehow, it feels like a black voodoo magic. From the current diagnosis I can’t really develop an intuition on how to reparametrize any model.

Any help is more than welcome.

The code and data can be found here:

data {
  int n_hotels;
  int n_regions;
  int<lower=1> hotel2region[n_hotels];
  int<lower=0> n_flips[n_hotels];
  int<lower=0> n_heads[n_hotels];

parameters {
  vector<lower=0, upper=1>[n_hotels] theta;
  vector<lower=0, upper=1>[n_regions] phi_cat;
  real<lower=0, upper=1> phi_pop;
  real<lower=1> kappa_cat;
  real<lower=1> kappa_pop;

transformed parameters {
  vector[n_hotels] alpha;
  vector[n_hotels] beta;
  vector[n_regions] al_cat;
  vector[n_regions] be_cat;
  real al_pop;
  real be_pop;
  al_cat = kappa_cat * phi_cat;
  be_cat = kappa_cat * (1 - phi_cat);
  alpha = al_cat[hotel2region];
  beta = be_cat[hotel2region];
  al_pop = kappa_pop * phi_pop;
  be_pop = kappa_pop * (1 - phi_pop);

model {
// weak hyperpriors
  kappa_cat ~ pareto(1, 1.5);
  kappa_pop ~ pareto(1, 1.5);
  phi_pop ~ beta(0.5, 0.5);
// priors
  theta ~ beta(alpha, beta);
  phi_cat ~ beta(al_pop, be_pop);

// Likelihood
  n_heads ~ binomial(n_flips,theta);