I am trying to use a non-centred parameterisation for a hierarchical model. My problem is that I’m not sure how to handle bounds on the individual-subject parameter, which is drawn from the group-level distribution. Suppose I want each subject’s parameter to be greater than 0. In the centred parameterisation I would write:
data {
  int<lower=1> nsub; // number of subjects
}
parameters {
  real<lower=0> p_group_mu; // group-level parameter in my model that only makes sense > 0
  real<lower=0> p_group_sd; // group-level standard deviation
  vector<lower=0>[nsub] p_sub; // individual-subject parameters
}
model {
  p_sub ~ normal(p_group_mu, p_group_sd);
}
In the non-centred parameterisation, I am stuck on which bounds to set for p_sub_raw, or how else to keep the same bounds on p_sub as before.
But doesn’t it make a difference whether I say:
p_sub ~ normal(p_group_mu, p_group_sd)
vs.
p_sub_raw ~ normal(p_group_mu, p_group_sd)
and p_sub = exp(p_sub_raw)
It seems to me that I’m making a different assumption about how the subject parameter is distributed around the group level. Or does it end up being the same internally?
Yes, it does make a difference. If you don’t have an intuition for it yet, it’s well worth working out by simulation what prior the second case implies for p_sub. That’s also an important exercise to go through when the hierarchy has multiple levels.
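As a rough illustration of that simulation, here is a minimal sketch that draws p_sub under both constructions for fixed group-level values. It assumes the group-level values are passed in as data, and the names p_sub_normal and p_sub_exp are made up for this example. With no parameters block it is meant to be run with the fixed_param sampler.

```stan
// Sketch only: compare the prior implied for p_sub by the two constructions.
data {
  real p_group_mu;          // group-level location, held fixed for the simulation
  real<lower=0> p_group_sd; // group-level scale, held fixed for the simulation
}
generated quantities {
  // Case 1: normal on p_sub itself, restricted to p_sub > 0.
  real p_sub_normal = normal_rng(p_group_mu, p_group_sd);
  // Case 2: normal on p_sub_raw, then p_sub = exp(p_sub_raw).
  real p_sub_exp = exp(normal_rng(p_group_mu, p_group_sd));
  // Rejection step for case 1: redraw until the value is positive.
  // This can be slow if p_group_mu is far below zero relative to p_group_sd.
  while (p_sub_normal <= 0) {
    p_sub_normal = normal_rng(p_group_mu, p_group_sd);
  }
}
```

Plotting the two sets of draws side by side should make the difference visible: in case 2, p_sub is lognormal, so p_group_mu and p_group_sd act on the log scale (the median of p_sub is exp(p_group_mu)) and the distribution is right-skewed, whereas in case 1 they act on the scale of p_sub itself.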
No worries. It helps to have cached solutions for most of these problems! I’d rather have users write to us than bang their heads on problems. Bang your head just long enough that you’ll appreciate the relief a solution brings :-)
@Bob_Carpenter this great thread was recently referenced from another thread. We have the same issue, but I’m not sure this solution actually works in all cases.
Do you know of a “cached solution” that also works without the explicit exp() trick (which, as @Jacquie mentions, changes the meaning of the prior) when “complete pooling” is the solution, or when someone happens to pass in only 1 subject? In either case p_group_sd will go to zero. If the positivity constraint is instead imposed through a lower bound of -mu/sd on the raw parameter, as you suggest, does that somehow avoid the divergences that would arise in the centred version when sd goes to zero?
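For reference, the lower bound of -mu/sd mentioned above could be written roughly as follows. This is a sketch under assumptions rather than a confirmed solution from this thread: the transformed parameters block, the use of std_normal(), and the vector declarations are choices made for this example, and priors on the group-level parameters are left out as in the original post.

```stan
// Sketch only: non-centred parameterisation that keeps p_sub > 0.
data {
  int<lower=1> nsub; // number of subjects
}
parameters {
  real<lower=0> p_group_mu; // group-level location
  real<lower=0> p_group_sd; // group-level scale
  // The bound is chosen so that p_group_mu + p_group_sd * p_sub_raw > 0.
  vector<lower=-p_group_mu / p_group_sd>[nsub] p_sub_raw;
}
transformed parameters {
  // Shift and scale the raw values back to the subject-level parameters.
  vector<lower=0>[nsub] p_sub = p_group_mu + p_group_sd * p_sub_raw;
}
model {
  p_sub_raw ~ std_normal();
  // Priors on p_group_mu and p_group_sd omitted, as in the original post.
}
```

Note that as p_group_sd shrinks with p_group_mu held positive, the bound -p_group_mu / p_group_sd moves towards minus infinity, so the constraint on p_sub_raw becomes progressively less restrictive. Whether that is enough to avoid divergences in the complete-pooling or single-subject case is exactly the open question here, and would need to be checked on the actual model.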