Brms approach to Stan models for nested versus non-nested multi-level models?

ssp3nc3r · June 15, 2019, 3:54pm

brms underlying Stan code for a non-nested multi-level model, e.g.,

make_stancode(y ~ 1 + (1 | re_1) + (1 | re_2) + (1 | re_3), data = d)

is identical to the Stan code for a nested model,

make_stancode(y ~ 1 + (1 | re_1 / re_2 / re_3), data = d)

whether or not the nesting is natural. It looks like the difference between brms-generated Stan models are in how they index the random effects. If naturally nested, it seems each random effect assumes the indexing of the level before it. Is that right? And if not-naturally nested, how does brms approach indexing?

Here’s some code:

# Simulated data
N    <- 100
y    <- rnorm(N)

# naturally nested random effects
re_1 <- sample(LETTERS[1:5], N, replace = T)
re_2 <- paste0(re_1, sample(seq(5), N, replace = T))
re_3 <- paste0(re_2, sample(seq(5), N, replace = T))
d    <- data.frame(y, re_1, re_2, re_3)

# create stan code and data for nonnested and nested models 
# using naturally nested random effects

library(brms)
nonnested_code <- make_stancode(y ~ 1 + (1 | re_1) + (1 | re_2) + (1 | re_3), data = d)
nonnested_data <- make_standata(y ~ 1 + (1 | re_1) + (1 | re_2) + (1 | re_3), data = d)
nested_code    <- make_stancode(y ~ 1 + (1 | re_1 / re_2 / re_3), data = d)
nested_data    <- make_standata(y ~ 1 + (1 | re_1 / re_2 / re_3), data = d)

# test for equality in code and data
identical(nonnested_code, nested_code) # TRUE
identical(nonnested_data, nested_data) # TRUE

# not naturally nested random effects
re_2 <- paste0(sample(seq(5), N, replace = T))
re_3 <- paste0(sample(seq(5), N, replace = T))
d    <- data.frame(y, re_1, re_2, re_3)

# create stan code and data for nonnested and nested models 
# using naturally nested random effects

nonnested_code <- make_stancode(y ~ 1 + (1 | re_1) + (1 | re_2) + (1 | re_3), data = d)
nonnested_data <- make_standata(y ~ 1 + (1 | re_1) + (1 | re_2) + (1 | re_3), data = d)
nested_code    <- make_stancode(y ~ 1 + (1 | re_1 / re_2 / re_3), data = d)
nested_data    <- make_standata(y ~ 1 + (1 | re_1 / re_2 / re_3), data = d)

# test for equality in code and data
identical(nonnested_code, nested_code) # TRUE
identical(nonnested_data, nested_data) # FALSE

writeLines(nested_code) # look at Stan code
str(nonnested_data)
str(nested_data)

paul.buerkner · June 15, 2019, 6:50pm

(1 | A/B) translates to (1 | A) + (1 | A:B) where A:B simply means creating a new grouping factor with the levels of A and B pasted together. This is explained a little bit longer in https://journal.r-project.org/archive/2018/RJ-2018-017/index.html

ssp3nc3r · June 15, 2019, 8:27pm

Thanks. That’s helpful. In looking at the paper, I do have a follow-up question on the paper description,

The / operator indicates nested grouping structures and expands one grouping factor into two or more when using multiple / within one term. For instance, if the user were to write (1 | g1/g2), brms will expand this to (1 | g1) + (1 | g1:g2). Instead of |, users may use || in grouping terms to prevent group-level correlations from being modeled.

In reviewing underlying Stan code and data from my example, I want to make sure I understand how the group-level correlations are being modeled, and when I compare:

nested_code   <- make_stancode(y ~ 1 + (1 | re_1 / re_2 / re_3), data = d)
nested_data   <- make_standata(y ~ 1 + (1 | re_1 / re_2 / re_3), data = d)

nested_codenc <- make_stancode(y ~ 1 + (1 || re_1 / re_2 / re_3), data = d)
nested_datanc <- make_standata(y ~ 1 + (1 || re_1 / re_2 / re_3), data = d)

# test for equality in code and data
identical(nested_code, nested_codenc) # TRUE
identical(nested_data, nested_datanc) # TRUE

The model (stan code and stan data) using | seems identical to the model using ||

paul.buerkner · June 15, 2019, 8:37pm

That’s because there is nothing to correlate in your model. Replace (1 | ...) by (1 + x | ...) and you will see the difference.

ssp3nc3r · June 15, 2019, 8:51pm

Ah, that makes sense. I misunderstood the sentence as meaning correlation across groups instead of correlation of predictors within each group.

I’d like to have the group (1 | g1) provide/share information for the nested group (1 | g1:g2).

paul.buerkner · June 15, 2019, 8:55pm

Then you need to make a conceptual proposal of how this should look like :-)

MegBallard · June 19, 2019, 12:48am

Learning Bayesian methods and Stan coding is difficult and something I don’t know a lot about.

Using and interpreting mulitlevel/mixed/hierarchical models is also hard, much harder than most people realize.

Check out general help on specifying multilevel models.

Ben Bolker’s FAQ
*This oft cited answer in this stack exchange thread
*Doug Bates article about mixed models in lme4 (he also has a lot of useful slide shows and lecture notes)
*There is a large archive of questions in archived mailing list [r-sig-mixed-models] archive

Obviously these are not bayesian methods but the formula syntax is largely the same except where explicitly cited in the brms manual.

Nested levels should always be explicitly coded so any specification you use is at least not going to be messed up by that one step.

ssp3nc3r · June 20, 2019, 10:05pm

Would:

Be a start?

paul.buerkner · June 20, 2019, 10:09pm

I don’t see how this can help since the model you linked to does not contain any correlation or similar measure to share information between random effects of different levels. Or am I overlooking something?

ssp3nc3r · June 23, 2019, 3:31am

As a start, it seems it may be useful to just change the priors from independent groups to using the broader group as the mean for the nested group; same for the standard deviations. So, the brms-generated Stan code in the above example,

target += student_t_lpdf(sd_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10);
target += normal_lpdf(z_1[1] | 0, 1);

target += student_t_lpdf(sd_2 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10);
target += normal_lpdf(z_2[1] | 0, 1);

target += student_t_lpdf(sd_3 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10);
target += normal_lpdf(z_3[1] | 0, 1);

Could be changed to,

target += student_t_lpdf(sd_1 | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10);
target += normal_lpdf(z_1[1] | 0, 1);
  
target += student_t_lpdf(sd_2 | 3, sd_1, 10)
    - 1 * student_t_lccdf(0 | 3, sd_1, 10);
target += normal_lpdf(z_2[1][J_2] | z_1[1][J_1], 1);
  
target += student_t_lpdf(sd_3 | 3, sd_2, 10)
    - 1 * student_t_lccdf(0 | 3, sd_2, 10);
target += normal_lpdf(z_3[1][J_3] | z_2[1][J_2], 1);

paul.buerkner · June 23, 2019, 9:55am

For me to seriously consider a proposal, we cannot just change a bunch of lines of Stan code and hope for the model to be still reasonable. Instead it is important to have a clear theory of what is the target, how the desired model looks like, what properties it has and why those makes sense.

With respect to what you are writing above it seems that, at least party, you want to use a centered parameterization (of the random effects), while brms uses the non-centered parameterization throughout. The latter has the advanatage of also applying to non-nested models and to usually having better sampling properties. Apart from that, if both parameterizations exist, they are equivalent.

ssp3nc3r · June 23, 2019, 2:44pm

Thank you for your explanations, I understand more fully what is intended with the brms-generated code. I didn’t mean to imply I’d like you to rework what brms does for me. It’s a really helpful tool, and like that I can even use its code to pick up where it left off on the occasions when I’d like a bit different expression.

I just included the code to show one way I think the information could be minimally shared between the groups, independent of whether a centered parameterization was used.

Topic		Replies	Views
Understanding brms nested grouping term in multivariate response models brms hierarchical-model , multivariate-normal	4	1030	December 5, 2024
Indexing approach to predictors brms	21	4646	February 8, 2023
Translating/comparing brms parameter estimates to rstan estimates brms rstan	9	1025	May 12, 2020
Brms: non-centered to centered parameterization brms	8	849	October 3, 2023
Individuals nested within groups brms	2	693	August 20, 2021

Brms approach to Stan models for nested versus non-nested multi-level models?

Related topics