Divergent Transitions when Scaling Hierarchical Model

liaochris · October 15, 2022, 12:49am

I have the following simple hierarchical model specified in Stan, and I have been testing it by varying K, the number of groups in my data. For K = 1 (group 1) and K = 2 (groups 1 and 2), fitting works fine. However, for K = 3 (groups 1, 2, and 3 from my data), I encounter a variety of issues, such as divergent transitions and high Rhat. What is really interesting is that at K = 3, the posterior distributions for groups 1 and 2 look very different from the posterior distribution for group 3. This is especially remarkable given that the posterior for group 1 looks the same at K = 1 and K = 2.

I’m unsure what might be causing this. Given the model I have specified below, adding groups should not change the estimates for the earlier groups. If there is any other information that would be helpful for me to provide, please let me know!

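data {
  int<lower=0> N;                     // number of observations
  int<lower=0> K;                     // number of groups
  vector[N] alpha_hat_jdy;            // estimated slopes
  vector<lower=0>[N] sigma_alpha;     // standard deviation of each estimate
  array[N] int<lower=1, upper=K> g;   // group assignment of each observation
}

parameters {
  vector<lower=0>[N] alpha_jdy;       // negated true slopes (positive)
  vector<lower=0>[K] k_my;            // gamma shape, one per group
  vector<lower=0>[K] theta_my;        // gamma rate, one per group (Stan's second gamma argument is a rate)
}

model {
  // no priors are given for k_my and theta_my, so they are implicitly flat on (0, inf)
  alpha_hat_jdy ~ normal(-alpha_jdy, sigma_alpha);
  alpha_jdy ~ gamma(k_my[g], theta_my[g]);
}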

LucC · October 15, 2022, 6:06pm

Hi, a few points that stand out that might contribute to your trouble:

The group-level parameters of your gamma distribution currently have improper priors (uniform over all positive real values). I strongly recommend that you specify at least a weakly informative prior for each.
Because the group-level parameters don’t share a prior, no pooling is happening, so I’m unsure what’s hierarchical about this model. For a hierarchical model, I would expect a parameter for the overall population mean and a variance-like parameter for the variation between groups. From those two, you would construct the shape and rate parameters of the gamma distribution for the group-level means.
The group-level means seem to be restricted to be strictly negative (negated gamma-distributed variates), whereas the data alpha_hat_jdy do not seem to be limited to negative values. For further help, it would be good to understand what you are trying to model and why you are modeling it this way.
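To make the second point concrete, here is a minimal sketch of one way such a hierarchy could be written. It is an illustration under assumptions, not a tested fix for this data set: the names mu, tau, m, and shape_w and every prior scale are placeholders introduced here, and the array[N] int declaration assumes Stan 2.26 or later. It uses the facts that a gamma distribution with mean mu and standard deviation tau has shape mu^2 / tau^2 and rate mu / tau^2, and that a gamma with shape a and mean m has rate a / m.

```stan
data {
  int<lower=0> N;
  int<lower=1> K;
  vector[N] alpha_hat_jdy;
  vector<lower=0>[N] sigma_alpha;
  array[N] int<lower=1, upper=K> g;
}
parameters {
  real<lower=0> mu;              // population mean of the group means (hypothetical name)
  real<lower=0> tau;             // between-group standard deviation (hypothetical name)
  vector<lower=0>[K] m;          // group-level means (hypothetical name)
  real<lower=0> shape_w;         // within-group gamma shape (hypothetical name)
  vector<lower=0>[N] alpha_jdy;  // negated true slopes
}
model {
  // Placeholder weakly informative priors; the scales are assumptions
  // and should be chosen from a prior predictive check.
  mu ~ normal(0, 1);
  tau ~ normal(0, 1);
  shape_w ~ gamma(2, 0.5);

  // Group means share a population distribution with mean mu and sd tau.
  m ~ gamma(square(mu / tau), mu / square(tau));

  // Within group k, alpha_jdy has shape shape_w and mean m[k].
  alpha_jdy ~ gamma(shape_w, shape_w ./ m[g]);

  // Measurement model, unchanged from the original post.
  alpha_hat_jdy ~ normal(-alpha_jdy, sigma_alpha);
}
```

Because mu and tau are shared across groups, the group means m inform each other, which is what produces partial pooling. Whether a single shared shape_w is appropriate is a modeling choice this sketch does not settle.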

liaochris · October 18, 2022, 3:33pm

Hi Luc,

Thank you for your suggestions! On your last point: I want to restrict -alpha_jdy to be negative because I’m trying to apply shrinkage to the slopes of some demand curves, and economic theory says that demand curves must slope downward.

On your note about priors: how would I go about specifying a weakly informative prior for the parameters of my gamma distribution?

LucC · October 18, 2022, 8:01pm

OK, if you expect your data (the demand-curve slopes) to be negative, it is good practice to declare the data with that constraint.

As for specifying a prior for your gamma distribution parameters: simulate from candidate priors and see what seems reasonable to you (a prior predictive check).
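As a sketch of what that declaration could look like, assuming every observed estimate really is non-positive: Stan checks declared bounds when it reads the data and stops with an error if any value violates them, so this only fits if no noisy estimate in alpha_hat_jdy is above zero.

```stan
data {
  int<lower=0> N;
  // Upper bound of 0: data loading fails if any estimate is positive.
  vector<upper=0>[N] alpha_hat_jdy;
}
```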
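A prior predictive check of that kind could be sketched as a standalone Stan program with only a generated quantities block. The gamma(2, 1) priors below are placeholders chosen purely for illustration, and N_sim and alpha_sim are hypothetical names. The idea is to run it with the fixed-parameter sampler, look at the simulated slopes, and adjust the candidate priors until the simulated values cover a plausible range for demand-curve slopes.

```stan
data {
  int<lower=1> N_sim;  // number of slopes to simulate per draw (hypothetical name)
}
generated quantities {
  // Draw the gamma parameters from candidate priors (placeholder values).
  real<lower=0> k_my = gamma_rng(2, 1);      // shape
  real<lower=0> theta_my = gamma_rng(2, 1);  // rate
  // Simulate slopes implied by those parameters, negated as in the model.
  array[N_sim] real alpha_sim;
  for (n in 1:N_sim) {
    alpha_sim[n] = -gamma_rng(k_my, theta_my);
  }
}
```

If the simulated slopes are routinely orders of magnitude larger or smaller than anything economically sensible, the candidate priors are too vague or mis-scaled and can be tightened before fitting.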

But more importantly, your model does not seem to be hierarchical. See my previous comment about what that would look like.
