What does non-centered parameterization actually do? How to interpret model? (brms)

Short summary of the problem
Hi everyone, this is maybe a silly and/or very basic question to those more familiar with statistics and modelling however I’m quite the newbie and I’m feeling very out of my depth.

I ran a model in brms (see below) with a normal intercept (i.e., 1+ on the right side) but it failed to converge and the error message suggested I try non-centered parameterization. After a google search I learned from a post somewhere that you can do so through 0+ (as is now included below) and after that the model ran successfully. However, I have no idea what that actually does/means and how it impacts interpretation of results - or if in fact it even made sense to do for the model I was running. When I google about non-centered parameterization in brms it seems to say everywhere that brms does this automatically which is confusing me further - if it’s done automatically then why did the error message suggest that and why did it run successfully after the change?

Honestly this is the first model I’ve written and it was based on a model someone else wrote so I’m quite overwhelmed with trying to understand it. I believe it is a non-linear mixed effect model? Is that even the correct description? Is it hierarchical?

Sorry for the stream of silly questions - any help anyone is willing to provide would be amazing. I can also provide any other information about the model/dataset/my questions you may want or need at your request.

Thanks so much!

code_to_run_your_model(if_applicable)

m4_cor ← brm(bf(logIVI ~ 0 + chicks + chickage + (1 + chicks + chickage | year) + (1 | site) + (1|yearsite),
sigma ~ year),
data = dw,
prior = c(set_prior(“normal(0, 0.5)”, class = “b”, coef = “chickage”),
set_prior(“normal(-0.1, 0.5)”, class = “b”, coef = “chicks”),
set_prior(“lkj(2)”, class = “cor”),
set_prior(“cauchy(0,1)”, class = “sd”)),
warmup = 1000,
iter = 5000,
chains = 3,
cores = 2,
control = list(adapt_delta = 0.99, max_treedepth = 15),
backend = “cmdstanr”,
threads = threading(2))

1 Like

The non-centered parameterization arises when trying to fit hierarchal models. For an in-depth discussion of what a hierarchal model is, what centered and non-centered parameterization are, and why they are so important for accurate computation see Hierarchical Modeling.

“Mixed models” utilize multiple hierarchical models in a particular way. They can be formalized as hierarchical overlapping factor models, see https://betanalpha.github.io/assets/case_studies/factor_modeling.html for an extensive discussion.

Although the interfaces in tools like brms and many others make it easy to specify certain kinds of “mixed models” there also hide a tremendous amount of assumptions behind these models, many of which affect computational performance. Consequently learning about concepts like “non-centered parameterizations” can appear overwhelming and will likely end up a pretty long journey (see for example how long the two links above are); on the plus side that journey will also help you understand these models and how to use them well in practice.

3 Likes

The non-centered parameterization is already implemented in brms. And in any case, using 0 + in place of 1 + would not do it. In brms, when you specify 1 + for the intercept, you are specifying the population intercept as follows (quoting from the manual): “By default, the population-level intercept (if incorporated) is estimated separately and not as part of population-level parameter vector b. As a result, priors on the intercept also have to be specified separately. Furthermore, to increase sampling efficiency, the population-level design matrix X is centered around its column means X_means if the intercept is incorporated.” If you specify 0 + Intercept + in your model, then that behavior is avoided and you can specify prior on the real intercept. This has nothing to do with non-centered parameterization as described for hierarchical models.
However, when you simply write 0 + and not 0 + Intercept +, then actually you are specifying the model to not estimate a population-level intercept at all. You can show yourself this using the following code:

x <- rnorm(100,2,1)
y <- 5 + 3*x + rnorm(100,0,1)
d1 <- cbind(y,x)
d1 <- data.frame(d1)
m1 <- brm(y ~ 1 + x, data=d1, cores=4)
m2 <- brm(y ~ 0 + Intercept + x, data=d1, cores=4)
m3 <- brm(y ~ 0 + x, data=d1, cores=4)
m1
m2
m3
plot(conditional_effects(m1), points=T)
plot(conditional_effects(m2), points=T)
plot(conditional_effects(m3), points=T)

If you run this, notice that the results of m1 and m2 are equivalent and recover the parameters of the simulated data. However, m3 has no intercept estimated, and the parameter for the effect of x on y is well off. From the plots you can see that m1 and m2 are the same, while m3 has an intercept at zero.

No, this is not a non-linear model, but it is a mixed-effect or hierarchical model.
From your code it looks like you estimated population-level effects for chicks and chickage but no population-level intercept, and you estimate group-level intercepts and effects for chicks and chickage for each year. You also have group-level intercepts for site and yearsite.

I realize this is an old post and the response is good, but I thought I might mention something about the intercept in brms, as it seems there was some confusion (when the OP did a Google search) about non-centered parameterization as it pertains to hierarchical models and the intercept of the centered design matrix in brms.

3 Likes