Problems estimating a negative binomial multilevel model

I am building a multilevel model for panel data with three layers. Since my dependent variable is a count variable I want to construct a negative binomial multilevel model with three layers. My model looks like the following:

functions {
    * Alternative to neg_binomial_2_log_rng() that 
  * avoids potential numerical problems during warmup
    int neg_binomial_2_log_safe_rng(real eta, real phi) {
      real gamma_rate = gamma_rng(phi, phi / exp(eta));
      if (gamma_rate >= exp(20.79))
        return -9;      
      return poisson_rng(gamma_rate);
data {
  // Define variables in data
  // Number of level-1 observations (an integer)
  int<lower=0> Ni;
  // Number of level-2 clusters
  int<lower=0> Nj;
  // Number of level-3 clusters
  int<lower=0> Nk;
  // Numer of models x countries
  int<lower = 0> Njk;
  // Number of fixed effect parameters
  int<lower=0> p;
  // Number of random effect parameters
  int<lower=0> Npars;
  // Variables with fixed coefficient
  matrix[Ni,p] fixedEffectVars;
  // Cluster IDs
  int<lower=1> modelid[Ni];
  int<lower=1> countryid[Ni];
  // Level 3 look up vector for level 2
  int<lower=1> countryLookup[Npars];
  int<lower=1> countryMonthLookup[Njk];
  // Continuous outcome
  int activations[Ni];

parameters {
  // Define parameters to estimate
  // Population intercept (a real number)
  real beta_0;
  real<lower=0> inv_phi;
  // Fixed effects
  vector[p] beta;
  // Level-1 errors
  real<lower=0> sigma_e0;
  // Level-2 random effect
  real u_0jk[Npars];
  real<lower=0> sigma_u0jk;
  // Level-3 random effect
  real u_0k[Nk];
  real<lower=0> sigma_u0k;

transformed parameters  {
  // Varying intercepts
  real beta_0jk[Npars];
  real beta_0k[Nk];
  real phi = inv(inv_phi);
  // Varying intercepts definition
  // Level-3 (10 level-3 random intercepts)
  for (k in 1:Nk) {
    beta_0k[k] = beta_0 + u_0k[k];
  // Level-2 (100 level-2 random intercepts)
  for (j in 1:Npars) {
    beta_0jk[j] = beta_0k[countryLookup[j]] + u_0jk[j];

model {
  // Prior part of Bayesian inference
  // Random effects distribution
  u_0k  ~ normal(0, sigma_u0k);
  u_0jk ~ normal(0, sigma_u0jk);
  beta[1] ~ normal(-0.25, 1);
  beta[2] ~ normal(0, 1);
  beta[3] ~ normal(0, 1);
  beta[4] ~ normal(0.25, 1);
  beta[5] ~ normal(-0.25, 1);
  beta[6] ~ normal(0.25, 1);
  sigma_u0jk ~ normal(0,1);
  sigma_u0k ~ normal(0,1);
  inv_phi ~ normal(0, 1);
  // Likelihood part of Bayesian inference
  for (i in 1:Ni) {
    activations[i] ~ neg_binomial_2_log(beta_0jk[countryMonthLookup[i]] + fixedEffectVars[i,]*beta, phi);

generated quantities{
  real y_rep[Ni];
  for (i in 1:Ni) {
    real eta_n = beta_0jk[countryMonthLookup[i]] + 
    y_rep[i] = neg_binomial_2_log_safe_rng(eta_n, phi);

When I try to estimate the model the performance (fit of the model in terms of in-sample predicitons) is extremely bad and I get the following error messages: Exception: normal_lpdf: Scale parameter is 0, but must be > 0!
at this line
activations[i] ~ neg_binomial_2_log(beta_0jk[countryMonthLookup[i]] + fixedEffectVars[i,]*beta, phi);.

The strange thing is that if I assume a normal distribution for y_{ijt}, so neg_binomial_2_log(beta_0jk[countryMonthLookup[i]] + fixedEffectVars[i,]*beta, phi); replaced by normal(beta_0jk[countryMonthLookup[i]] + fixedEffectVars[i,]*beta, sigma_e0);, the predictions are great and I don’t get any error message.

Does anyone know what is going wrong here? Unfortunately, I cannot post any data for my model, but I checked it and there are no missing values or anything like that.

Any helk is really appreciated!

I’m guessing this error is happening because sigma_u0k or sigma_u0jk are going to zero. Is this true? I assume if you move the lower bounds on them up from 0.0 to like 1e-2 the error will go away? If so, which one is causing the issue? Are any of the other parameters behaving weirdly?

Is the model behaving like it’s nearly working (a few divergences, just a couple parameters off, Rhats near 1), or does it look like it’s off in crazy-land (tons of divergences, Rhats of like 2+, all parameters in weird places)?

@bbbales2 Thanks for your reply!

My model indeed seems like it is nearly working; the parameter estimates are not very weird and the Rhat values are close to 1 (around 1.05). However, I do get a lot of divergences, when doing 5000 draws with 2500 warmup and 2500 sampling draws I get 2336 divergent transitions after warmup. This does not seem right to me. This includes your approach of setting the lower bounds of sigma_u0k and sigma_u0jk to 1e-2.

Furthermore, the values of sigma_u0k and sigma_u0jk are then 0.67 and 0.14 respectively. So, not really 0.

Do you have any suggestions on what else I can try?

Ooh, if it’s divergences then let’s non-center those random effects.

Get rid of u_0k in parameters and replace it with u_0k.

Define the prior on u_0kz in the model like:

u_0kz ~ normal(0, 1)

And then define u_0k as a transformed parameter like:

u_0k = sigma_u0k * u_0kz;

Do the same for u_0jk and let’s see if those divergences go away.

What do you exactly mean by “Get rid of u_0k in parameters and replace it with u_0k.”? What do I have to replace u_0k with?

My bad I mean replace with u_0kz. Like:

parameters {
  real u_0kz[Nk];

transformed parameters {
  real u_0k[Nk];

  for(i in 1:Nk) {
    u_0k[i] = sigma_u0k * u_0kz[i];

model {
  u_0kz ~ normal(0, 1);

Something like that

I tried your approach but unfortunately I still get the following error messages:

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:" [2] "Exception: neg_binomial_2_log_lpmf: Log location parameter is inf, but must be finite!

and this

"Exception: neg_binomial_2_log_lpmf: Precision parameter is 0, but must be > 0!

Hmm, that’s interesting. With infs floating around it’s probably best to just print things until you find what’s going wrong.

For instance, in the model you can do:


And make sure that u_0k isn’t infinity. If it is, then work from there I guess. I had a look at my code and I think it’s right but that wouldn’t be the first coding mistake I’ve made :P.

Why when you replace the neg_binomial with a normal do you swap out phi with sigma_e0?

I do not see you using sigma_e0 anywhere in you model. Perhaps phi is going to zero since inv_phi is set lower bound but that can allow 1/inv_phi to become arbitrarily small? Maybe get rid of inv_phi and use phi directly, setting lower bound on that parameter.

1 Like

Yes, I indeed swap phi with sigma_e0 in my model. In the negative binomial model I do not use sigma_e0 at all. Is this wrong?

I will try your approach!

Not necessarily wrong but they have different priors and limits. The sigma one has a specific lower bound. Not sure how inverting affects the lower bound for phi.

@ezke I replaced inv_phi with only phi, but unfortunately I still get the huge amount of divergent transitions. Do you maybe have another idea what could be wrong?