Divergent transitions and exceeded max treedepth warnings

Hi, I ran 3000 iterations with 3 chains for my rstan model and got the following warnings, including one about divergent transitions:

1: There were 2808 divergent transitions after warmup. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them. 
2: Examine the pairs() plot to diagnose sampling problems
 
3: The largest R-hat is 1.09, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat 
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess 

So I raised adapt_delta (the target acceptance probability) from its default to 0.99 and ran 4000 iterations with 3 chains, and then I got the following warnings:

Warning messages:
1: There were 1049 divergent transitions after warmup. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them. 
2: There were 4951 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded 
3: Examine the pairs() plot to diagnose sampling problems
 
4: The largest R-hat is 1.22, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat 
5: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
6: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess 

Questions:

  1. Is a high acceptance probability a problem in HMC/rstan? I know that in the MH algorithm a high acceptance probability is not a good thing.
  2. For the second set of warnings, should I first raise max_treedepth to maybe 15 to get rid of the max_treedepth warning, and also maybe increase adapt_delta from 0.9 to 0.9999 (if a high acceptance probability is not a problem) to get rid of the 1049 divergent transitions?

Thanks!

Raising adapt_delta or max_treedepth when you have so many divergent transitions is rarely the solution. You may want to have a look at Divergent transitions - a primer to diagnose the underlying problem in your model. If you want someone to help you with that, post your model here.

Perhaps @maxbiostat or @Max_Mantei have more experience with this kind of model.

Hi, I’m not very familiar with this kind of model, but there are two things that you can do to make it easier for us to help you.

First, can you please explain (for instance, with math formulas) what your model is supposed to do? What do the variables mean and how are they related?
Second, can you post some code that simulates fake data as input for this model, so that we can reproduce the problems on our end?

There are also some lines of code that seem suspicious at first glance. You have

(a-0.25)/(3-0.25) ~ beta(1.5,1.5);
(b+3)/(3+3) ~ beta(2,2);
(t+3)/(1+3) ~ beta(2,2);

but the parameters a, b, t are not declared with an upper bound, and b and t also don’t have a lower bound. The Beta distribution has support on [0, 1], so this should probably be changed. For example, the first line of code implies that $0 \leq (a-0.25)/(3-0.25) \leq 1$, so $0.25 \leq a \leq 3$. You should declare parameter a with these bounds (i.e. vector<lower=0.25, upper=3>[n_item] a;).
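To make that concrete, here is a minimal sketch of what the bounded declarations could look like. The bounds are the ones implied by the three prior statements above. Treating b and t as vectors of length n_item is my assumption, since I don't know how they are declared in your model, and the likelihood is left out entirely.

```stan
data {
  int<lower=1> n_item;  // number of items (name taken from the declaration above)
}
parameters {
  // Bounds match the support implied by each scaled Beta prior.
  vector<lower=0.25, upper=3>[n_item] a;
  vector<lower=-3, upper=3>[n_item] b;   // assumed to be a vector like a
  vector<lower=-3, upper=1>[n_item] t;   // assumed to be a vector like a
}
model {
  // Each left-hand side now always lies in [0, 1].
  (a - 0.25) / (3 - 0.25) ~ beta(1.5, 1.5);
  (b + 3) / (3 + 3) ~ beta(2, 2);
  (t + 3) / (1 + 3) ~ beta(2, 2);
  // ... likelihood goes here ...
}
```

Because each transformation is linear with fixed constants, the Jacobian is a constant and no extra adjustment to the target is needed. If the declared bounds are wider than these, initial values can land where the Beta density is zero, which may be one source of initialization errors.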

Also, is there a specific reason why you can’t constrain these parameters just to 0 and 1 so you don’t have to perform these transformations?
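One way this could look, as a sketch only: sample raw parameters on [0, 1] and rescale them in the transformed parameters block. The names a_raw, b_raw and t_raw are hypothetical, and I am again assuming all three are vectors of length n_item.

```stan
data {
  int<lower=1> n_item;
}
parameters {
  // Hypothetical raw parameters living on the unit interval.
  vector<lower=0, upper=1>[n_item] a_raw;
  vector<lower=0, upper=1>[n_item] b_raw;
  vector<lower=0, upper=1>[n_item] t_raw;
}
transformed parameters {
  // Rescale to the ranges used in the original prior statements.
  vector[n_item] a = 0.25 + (3 - 0.25) * a_raw;  // a in [0.25, 3]
  vector[n_item] b = -3.0 + 6.0 * b_raw;         // b in [-3, 3]
  vector[n_item] t = -3.0 + 4.0 * t_raw;         // t in [-3, 1]
}
model {
  a_raw ~ beta(1.5, 1.5);
  b_raw ~ beta(2, 2);
  t_raw ~ beta(2, 2);
  // ... likelihood in terms of a, b, t goes here ...
}
```

This should define the same priors on a, b and t as the scaled statements, with the priors placed directly on declared parameters.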

Anyway, I hope this helps!

Hi, thanks for the reply!

I used to have lower and upper bounds on the parameters, though they may not have matched the ones implied by the beta distribution's [0, 1] support. I got an initialization error, so I deleted the bounds, and now my model runs successfully.

I want to specify four-parameter beta distributions for my parameters, but there is no four-parameter beta distribution function in rstan, so I applied a transformation to the standard beta distribution.

I see. I’m not familiar with the four-parameter Beta distribution. Looking briefly at https://en.wikipedia.org/wiki/Beta_distribution#Four_parameters I get the impression that it is simply a linear transformation of a ‘standard’ Beta variable. Is that correct? If so, it should still have lower and upper bounds.
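If that reading is right, the relationship can be written out as below. This is a sketch under that assumption; the symbols l and u for the lower and upper bounds are my own notation.

```latex
\begin{aligned}
X &\sim \mathrm{Beta}(\alpha, \beta), \qquad Y = l + (u - l)\,X, \qquad l < u, \\
f_Y(y) &= \frac{(y - l)^{\alpha - 1}\,(u - y)^{\beta - 1}}
               {B(\alpha, \beta)\,(u - l)^{\alpha + \beta - 1}},
\qquad l \le y \le u, \\
\log f_Y(y) &= \log f_X\!\left(\frac{y - l}{u - l}\right) - \log(u - l).
\end{aligned}
```

The last line shows that the log density of Y differs from the standard Beta log density of the rescaled value only by the constant $\log(u - l)$, and that the density is only defined for $l \le y \le u$.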

Still, I believe it’s more important to explain what the model is supposed to be doing. Also, please generate some fake data so your problem can be reproduced. For me, the problem is that your model is too big to fully understand without some explanation :-)
