I have the following simple hierarchical model specified in Stan, and I have been testing it by varying K, the number of groups in my data. When tested with K = 1 (group 1) and K = 2 (groups 1 and 2), fitting works fine. However, with K = 3 (using groups 1, 2, and 3 from my data), I encounter a variety of issues, such as divergent transitions, high Rhat, etc. What is really interesting is that at K = 3, the posterior distributions for groups 1 and 2 look very different from the posterior distribution for group 3. This is especially remarkable given that the posterior for group 1 looks the same when tested with K = 1 and K = 2.

I'm unsure what might be causing this. Adding additional groups should not change the model estimation process for earlier groups, given the model that I have specified below. If there is any other information that would be helpful for me to provide, please let me know!

data {
  int<lower=0> N;
  vector[N] alpha_hat_jdy;
  vector<lower = 0>[N] sigma_alpha;
  int g[N]; // group assignment
  int<lower=0> K; // number of groups
}
//
parameters {
  vector<lower = 0>[N] alpha_jdy;
  vector<lower = 0>[K] k_my;
  vector<lower = 0>[K] theta_my;
}
//
model {
  alpha_hat_jdy ~ normal(-alpha_jdy, sigma_alpha);
  alpha_jdy ~ gamma(k_my[g], theta_my[g]);
}

Hi, a few points that stand out that might contribute to your trouble:

the group-level parameters of your gamma distribution (k_my and theta_my) currently have improper priors (flat over all positive real values); I strongly recommend you specify at least a weakly informative prior for each

because the group-level parameters don't have a prior, no pooling is happening, so I'm unsure what's hierarchical about this model; for a hierarchical model, I would expect a parameter for the overall population mean and a variance-like parameter for the variation between groups; from these two, construct the shape and rate parameters of your gamma distribution for group-level means

the group-level means seem to be restricted to be strictly negative (negated gamma-distributed variates), whereas the data alpha_hat_jdy do not seem to be limited to negative values; for further help, it would be good to understand what you are trying to model and why you are modeling it this way
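
To make the first two points concrete, here is a minimal sketch of one way to do it, not a definitive fix. It puts half-normal priors on a per-group mean and standard deviation and derives the gamma shape and rate from them by moment matching (note that the second argument of Stan's gamma is a rate, not a scale). The names mu_my and sd_my and the prior scales of 1 are my assumptions, not something from your model; pick scales that match the units of your slopes. The array declaration uses the newer syntax; older Stan versions would write int g[N] as in your code.

```stan
data {
  int<lower=0> N;
  int<lower=1> K;                       // number of groups
  vector[N] alpha_hat_jdy;              // noisy slope estimates
  vector<lower=0>[N] sigma_alpha;       // their standard errors
  array[N] int<lower=1, upper=K> g;     // group assignment
}
parameters {
  vector<lower=0>[N] alpha_jdy;         // slope magnitudes; the slope is -alpha_jdy
  vector<lower=0>[K] mu_my;             // assumed name: mean magnitude per group
  vector<lower=0>[K] sd_my;             // assumed name: sd of magnitudes per group
}
transformed parameters {
  // Moment matching for gamma(shape, rate):
  //   mean = shape / rate, variance = shape / rate^2
  vector<lower=0>[K] k_my = square(mu_my ./ sd_my);      // shape = mean^2 / sd^2
  vector<lower=0>[K] theta_my = mu_my ./ square(sd_my);  // rate  = mean / sd^2
}
model {
  // Half-normal priors (the lower=0 bounds truncate at zero).
  // The scale of 1 is a placeholder assumption.
  mu_my ~ normal(0, 1);
  sd_my ~ normal(0, 1);

  alpha_jdy ~ gamma(k_my[g], theta_my[g]);
  alpha_hat_jdy ~ normal(-alpha_jdy, sigma_alpha);
}
```

As written, this still estimates each group separately; it only replaces the flat priors with proper ones. To get pooling across groups, mu_my (and possibly sd_my) would themselves need a shared prior whose parameters are estimated from the data.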

Thank you for your suggestions! On your last point - I want to restrict -alpha_jdy to be negative because I'm trying to apply shrinkage methods to the slopes of some demand curves, and economic theory says that the slopes of demand curves must be negative.

On your note about priors - how would I go about specifying a weakly informative prior for the parameters of my gamma distribution?