Varying highly correlated slopes: between-group variation collapses to zero, poor convergence

I have a regression model with 32 predictors and a random-walk (RW) prior on the slopes (more details below) that works well. Adding varying intercepts also works well.

But when I let the slopes vary (group slopes = global RW plus group-specific RW, see below), the between-group variability samples poorly (\hat{R} \approx 2) and gets stuck at almost exactly zero (about 10^{-5} after standardizing predictors and outcome). I have J = 9 groups at the moment, with about 20 observations each. I tried a non-centered parameterization, a GP instead of the RW, and removing the random intercepts in case their correlation with the random slopes was the problem. None of these changed the basic problem.

I guess this stems from high correlation between the predictors, which leaves the slopes poorly identified. How can I get the model to explore the possibilities?

Global model:

Standard regression with one big exception… I start with:

y = \alpha + X\beta + \epsilon

where X is an n \times k matrix (n observations, k predictors). Here k is 32 and the predictors are ordered: for each observation i, X_{i,1},...,X_{i,32} is an approximately continuous signal. I therefore expect the coefficients \beta_1,...,\beta_{32} to vary smoothly as well, so I put a random-walk prior on them (I also tried a GP and got almost identical results, but the RW samples more efficiently):

\beta_k \sim RW(k, \sigma_{rw})

where \sigma_{rw} is the SD of the RW steps. This model fits well and performs well!
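To make the prior concrete, here is a minimal NumPy sketch of one draw from a first-order random walk over the 32 coefficients. The Gaussian steps, the starting value `beta_1 = 0`, and `sigma_rw = 0.1` are assumptions for illustration; the post does not say how the first coefficient is anchored.

```python
import numpy as np

def random_walk(k, sigma_rw, rng, beta_1=0.0):
    """One RW path beta_1..beta_k with N(0, sigma_rw) steps (assumed form)."""
    steps = rng.normal(0.0, sigma_rw, size=k - 1)
    # First coefficient is fixed at beta_1; the rest accumulate the steps.
    return beta_1 + np.concatenate(([0.0], np.cumsum(steps)))

rng = np.random.default_rng(0)
beta = random_walk(k=32, sigma_rw=0.1, rng=rng)  # sigma_rw is illustrative
```

A smaller `sigma_rw` forces neighbouring coefficients closer together, which is what regularizes the correlated predictors in the global model.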

Multilevel model:

I start by letting \alpha_{j} vary by group (9 groups). That works great.

Then I let the group-specific slopes \beta_{k,j} be the sum of the global RW and a RW for each j:

\beta_{k,j} = \beta_k + \beta'_{k,j}

where \beta_k \sim RW(k, \sigma_{rw}) as before and \beta'_{k,j} \sim RW(k, \sigma'_{rw}) for each j.
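As a sketch of this construction in non-centered form (one of the parameterizations mentioned above), the group deviations can be built from unit-scale steps `z` that are scaled by \sigma'_{rw}. The numeric values and the choice to start every deviation at zero are assumptions for illustration, not taken from the actual model.

```python
import numpy as np

rng = np.random.default_rng(1)
K, J = 32, 9             # predictors and groups, from the post
sigma_rw = 0.1           # assumed value, illustration only
sigma_rw_group = 0.05    # assumed value for sigma'_rw, illustration only

# Global RW beta_k, anchored at 0 (assumption).
beta_global = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, sigma_rw, K - 1))))

# Non-centered group RWs: unit-scale steps z, scaled by sigma'_rw.
z = rng.normal(0.0, 1.0, size=(K - 1, J))
beta_dev = np.vstack([np.zeros((1, J)), np.cumsum(sigma_rw_group * z, axis=0)])

# beta_{k,j} = beta_k + beta'_{k,j}, shape (K, J).
beta_group = beta_global[:, None] + beta_dev
```

When `sigma_rw_group` goes to zero, every column of `beta_group` equals `beta_global`, which is the collapsed state the sampler appears to be stuck in.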

It’s not completely clear (to me) what the difference is here. Does this increase the number of slopes?

Does this mean that you have 32 different predictors for each group? The number of observations seems small either way, so maybe there’s not enough information to distinguish the slopes, and the results are poor.
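A rough count, using the sizes quoted in the question (32 predictors, 9 groups, about 20 observations per group) and assuming every group gets its own deviation for every predictor:

```python
K, J, n_per_group = 32, 9, 20   # n_per_group is approximate ("about 20")

n_obs = J * n_per_group          # total observations
n_global_slopes = K              # beta_k
n_group_slopes = K * J           # beta'_{k,j}
n_intercepts = J                 # alpha_j

n_coef = n_global_slopes + n_group_slopes + n_intercepts
print(n_obs, n_coef)             # prints: 180 329
```

The RW priors tie these coefficients together, so the effective number of parameters is smaller than the raw count, but each group still has roughly 20 observations to inform 32 group-specific slopes.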

Very possible, but I’m not clear enough on all the details to make a better assessment. In any case, my suggestion is to make a smaller version of the model, with fewer predictors, maybe a couple of groups, and, if needed, less data (or aggregate “similar” groups). That should give you an idea of where things go wrong.
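One way to set up such a reduced test is to simulate fake data with known group slopes and check whether the model recovers a nonzero between-group scale. The following is only a sketch: the sizes, the scales, and the way the correlated predictors are generated are all hypothetical stand-ins, not the real data.

```python
import numpy as np

rng = np.random.default_rng(2)
K, J, n_per_group = 4, 2, 20    # hypothetical reduced sizes
sigma_y = 0.5                   # assumed residual SD

# Smooth (hence correlated) predictors: each row is a cumulative sum over k.
X = np.cumsum(rng.normal(size=(J * n_per_group, K)), axis=1)
X = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize columns
group = np.repeat(np.arange(J), n_per_group)  # group index per observation

# Known "true" parameters, RW-structured as in the question (assumed scales).
alpha = rng.normal(0.0, 1.0, size=J)
beta = np.cumsum(rng.normal(0.0, 0.3, size=K))
beta_dev = np.cumsum(rng.normal(0.0, 0.2, size=(K, J)), axis=0)
beta_group = beta[:, None] + beta_dev         # shape (K, J)

# mu_i = alpha_{j[i]} + sum_k X_{i,k} * beta_{k, j[i]}
mu = alpha[group] + np.einsum("nk,kn->n", X, beta_group[:, group])
y = mu + rng.normal(0.0, sigma_y, size=mu.shape)

# Inspect how strongly the predictors are correlated.
print(np.corrcoef(X, rowvar=False).round(2))
```

If the reduced model recovers the simulated deviations, K and J can be increased step by step until the between-group scale collapses, which should show where identification breaks down.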