Struggling with priors and standardization

For the first dataset, here is the model I have been trying to fit with brms:

y_{ij} \sim N(a_0 + a_1 x_i + b_{0i} + b_{1i} x_i + c_j, s^2)
b_{ki} \sim N(0, s_{ki}^2), k = 0,1
c_j \sim N(0, s_2^2)

With

mean(y) = 73, sd(y) = 556

If I use the default priors in brms (an improper uniform distribution for the population effects and half-Student-t(3) for the scale parameters), fitting takes about 13 minutes with 4 chains and 1000 iterations. However, if I standardize the response variable y, convergence is fine, but the runtime is 3.5 hours with the same chains and iterations! My questions are:

(1) I have trouble properly adjusting the results of a hierarchical model for the centering and scaling introduced by standardization. Should the adjustment be made at the level of the individual posterior draws, or at the level of the summarized population/group effects?

(2) Is it really a bad idea to standardize the response variable y? I got mixed information when reading the discussions from various sources. Why was the runtime dramatically worse after the standardization of the response variable (13 minutes vs. 3.5 hours)?
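To make the adjustment in question (1) concrete, here is a sketch of the rescaling involved. It assumes that only y is standardized, as y^* = (y - m)/d with m = mean(y) = 73 and d = sd(y) = 556, and that x is left on its original scale. The starred symbols are my own notation for the parameters of the model fitted to y^*.

```latex
% Model on the standardized scale (starred parameters):
%   y^*_{ij} \sim N(a^*_0 + a^*_1 x_i + b^*_{0i} + b^*_{1i} x_i + c^*_j, (s^*)^2)
% Substituting y_{ij} = m + d\, y^*_{ij} gives the original-scale parameters:
\begin{aligned}
a_0    &= m + d\, a^*_0, & a_1    &= d\, a^*_1, \\
b_{ki} &= d\, b^*_{ki},  & c_j    &= d\, c^*_j, \\
s      &= d\, s^*,       & s_{ki} &= d\, s^*_{ki}, \qquad s_2 = d\, s^*_2 .
\end{aligned}
```

Only the intercept picks up the shift m; everything else is multiplied by d. Each map is linear with d > 0, so for a single parameter, transforming every draw and then summarizing gives the same mean and quantiles as transforming the summaries. For combined quantities such as a_0 + b_{0i}, the sum would have to be formed draw by draw before summarizing.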

For the second dataset, the model is a little simpler:

y_{ij} \sim N(a + b_i + c_j, s^2)
b_i \sim N(0, s_1^2)
c_j \sim N(0, s_2^2)

With

mean(y) = 0.025, sd(y) = 0.265

I managed to get the model to converge with the default priors, 8 chains and 2000 iterations. However, the results do not make sense because all the posterior distributions for

a + b_i

were pretty much the same. I noticed that the posterior estimate of the standard deviation s_1 was very small: 0.0017, with a 95% quantile interval of [0.00007, 0.00469]. So I suspect the problem is the default priors. My third question is:

(3) How should priors be set properly for a hierarchical model like this one?

Hey, here are some sources I found on how to find the right priors that might help you as well:

From what I understand, it essentially comes down to plotting the combined prior and trying out different values, so that you prevent too much weight on extreme values but still allow the sampler to explore the parameter space properly.
As far as I can tell, this is largely trial and error combined with some experience.
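As a rough illustration of that trial-and-error loop, here is a minimal Python sketch (plain Python, not brms code) that draws from a half-Student-t prior with 3 degrees of freedom at a few candidate scales and prints some quantiles. The candidate scales and the helper names `half_student_t_draws` and `prior_summary` are my own assumptions for illustration; I am not claiming any of these scales is what brms would pick for your data.

```python
import math
import random


def half_student_t_draws(df, scale, n, rng):
    """Draw n samples from a half-Student-t(df, 0, scale) distribution."""
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        # A chi-square(df) variate is Gamma(shape=df/2, scale=2).
        v = rng.gammavariate(df / 2.0, 2.0)
        t = z / math.sqrt(v / df)      # standard Student-t draw
        draws.append(abs(t) * scale)   # fold at zero, then rescale
    return draws


def prior_summary(df, scale, n=20000, seed=1):
    """Return empirical 50%, 95% and 99% quantiles of the prior draws."""
    rng = random.Random(seed)
    draws = sorted(half_student_t_draws(df, scale, n, rng))

    def q(p):
        return draws[int(p * (n - 1))]

    return {"q50": q(0.50), "q95": q(0.95), "q99": q(0.99)}


if __name__ == "__main__":
    sd_y = 0.265  # sd(y) reported for the second dataset
    for scale in (2.5, 0.5, 0.1):  # illustrative candidate scales only
        s = prior_summary(3, scale)
        print(
            f"scale={scale}: median={s['q50']:.3f}, "
            f"95%={s['q95']:.3f}, 99%={s['q99']:.3f}, "
            f"median/sd(y)={s['q50'] / sd_y:.1f}"
        )
```

Comparing the printed quantiles with sd(y) gives a feel for whether a candidate scale puts most of its mass on group-level standard deviations that are plausible for the data, or far out in the tails.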

I don’t think I have enough experience to help with the other two questions so I’d rather not add my thoughts and let someone with more experience do that instead.

And Discourse does support math mode:

$y_{ij} \sim N(a + b_i +c_j,  s²)$

turns into:
y_{ij} \sim N(a + b_i +c_j, s²)
