Numeric overflow for Dirichlet distribution with random effect using ADVI

I am trying to fit the following Bayesian model in Stan using ADVI:

\mathbf{Y}_{ij} \sim \mbox{Dirichlet}(\alpha_i \mathbf{1}),
\alpha_i \sim \mbox{Gamma}(1, 1),

where \mathbf{Y}_{ij} is a 5-dim vector for i = 1, 2, 3 and j = 1, \dots, 20. When \alpha degenerates to a scalar, the model works fine. However, whenever I want different \alpha values for the Dirichlet distribution, I keep getting the following error:

Error in sampler$call_sampler(c(args, dotlist)) :
Exception: Error in function boost::math::lgamma(double): numeric overflow (in ‘model440468d045de_test_test’ at line 21)

A few notes:

  1. My actual model is more complicated than this, but it got stuck at this Dirichlet distribution thing.

  2. It seems that Stan does not allow
    vector[] ~ Dirichlet(vector[]).
    Therefore whenever I am varying the Dirichlet parameters I receive the above error message. How can I get around this issue? I really want to vary my Dirichlet parameters.

  3. Another issue, which is more related to the vb function, is that I realize that sometimes a model that works perfectly well crashes when I increase the number of iterations to adapt the stepsize (adapt_iter). This doesn’t make sense to me, since I thought more iterations to adapt the stepsize should yield better (more stable) result. Also, if whether a model works or not depends on the number of iterations I specified, then how do I run simulation studies? (I understand that vb is still an experimental algorithm, but I really hope I can use it for my simulations.)

Any comments are appreciated.

Here is my Stan model:

data {
  int<lower=0> N; //# of observations = 60
  int<lower=0> K; //Dimension of Y = 5
  int<lower=0> I; //# of Groups = 3
  int<lower = 0, upper = I> x[N];
  simplex[K] y[N];
}
parameters {
  real<lower = 0> alpha[I];
}
transformed parameters {
  vector<lower = 0>[K] dirch_temp[N];

  for (n in 1:N) 
    dirch_temp[n] = rep_vector(alpha[x[n]], K);
}
model {
  alpha ~ gamma(1, 1);
  for (n in 1:N) {
    y[n] ~ dirichlet(dirch_temp[n]);
  }
}

And here is my R code:

y = matrix(NA, nrow = 60, ncol = 5)
for (n in 1:20){
  y[n, ] = rgamma(5, 1, 1)
  y[n, ] = y[n, ] / sum(y[n, ])
}
for (n in 21:40){
  y[n, ] = rgamma(5, 1.1, 1)
  y[n, ] = y[n, ] / sum(y[n, ])
}
for (n in 41:60){
  y[n, ] = rgamma(5, .9, 1)
  y[n, ] = y[n, ] / sum(y[n, ])
}

test_data <- list(N=60,
                  K = 5,
                  I = 3,
                  x = c(rep(1,20), rep(2,20), rep(3, 20)),
                  y = y)

test_model = stan_model(file = 'nmf_test_temp.stan')
nmf_vb = vb(test_model,
            data = test_data,
            seed = 235432)

My guess is that you are going to need a gamma prior with a shape parameter sufficiently greater than 1 in order to keep the lgamma function from overflowing near zero.

As for this,

you are putting too much faith in the adaptation.

Thank you for your response!

I changed the gamma prior to \gamma \sim \mbox{Gamma}(100,1) and still got the overflow message. I even tried to generate the data from y \sim \mbox{Dirichlet}(100 \times \mathbf{1}) and the problem still exists.

As for the adaption, I just don’t understand why sometimes changing the number of adaptation steps can cause the model not working. Is this just because the vb function is still not stable, or is it really an issue with the adaptation?

One of the reasons vb is not stable is because the adaptation is not great. You could also try putting it a line like

model {
  for (i in 1:I) if (alpha[i] == 0) reject("underflow");
  ...

I’m sorry, but I’m still getting the numeric overflow error…

I have similar issues. It seems like dirichlet does not accept a random variable(parameter) or something derived from as an input alpha vector.

I can put hard constraints on the value to be way above 0 and I still get this error.

I just replaced the dirichlet with a stick breaking procedure.