Define a multilevel model with hierarchical/adaptive priors in brms


#1

I appreciate it if someone helps me to define the following multilevel model with hierarchical (adaptive) priors in brms.

The model is a binary logistic regression where intercept and coefficients are grouped by the “grp” variable:

    y ~ Bernoulli(p)
    logit(p) = intercept[grp] + b_1[grp] * x_1 + b_2[grp] * x_2

    intercept ~ Normal(mu_0, sigma_0)
    mu_0 ~ Normal(0, 1)
    sigma_0 ~ HalfCauchy(0, 1)

    b_1 ~ Normal(mu_1, sigma_1)
    mu_1 ~ Normal(0, 1)
    sigma_1 ~ HalfCauchy(0, 1)

    b_2 ~ Normal(mu_1, sigma_1)
    mu_2 ~ Normal(0, 1)
    sigma_2 ~ HalfCauchy(0, 1)

brms code I have so far:

brm(formula = y | trials(1) ~ 1 + (1 | grp) + (0 + x_1 | grp) + (0 + x_2 | grp)
family = binomial,
prior = c(prior(normal(0, 1), class = Intercept),
                prior(cauchy(0, 1), class = sd)),
data = dat,
iter = 5000, warmup = 1000, chains = 4, cores = 4, control = list(adapt_delta = 0.98))

Note: I intentionally used binomial family instead of bernoulli, but don’t remember the reason!

1. Is the brms code defines the model?
2. How do I specify the priors for mu_1 and mu_2?
3. I guess I should include population-level effects too. Can missing these effects in the model be the reason for the long running time? (i.e., 1 hour, 180 samples, 100 variables, grp variable has 11 distinct values)

Thank you,

  • Operating System: Windows 10
  • brms Version: 2.7.0

#2

Hi, there is a bunch of minor things:

  • First note, that you can inspect the Stan code generated by brms with make_stancode to make sure the model matches what you expect
  • Your brms formula contains overall (population-level) intercept, but your model specification does not
  • You can use multiple terms with one grouping - (1 | grp) + (0 + x_1 | grp) + (0 + x_2 | grp) should be the same as (1 + x_1 + x_2 | grp)
  • The | operator implicitly models correlations between groups, to have the model without correlations (which is what you have in your definition above), use 1 + x_1 + x_2 || grp
  • Brms does not (at least not by default) model the mean of group-level predictors (your mu_0, mu_1, mu_2) and assumes it is always 0. This is because those can be absorbed into the population-level intercept.

Finally, remember the folk theorem - if it runs slow, it is almost certainly a problem with the model. Missing population-level effects can be one of the problems. Uncentered predictors could also be an issue.


#3

Agreed to all what Martin said. Taken together, I think the correct formula would be

y | trials(1) ~ 1 + x_1 + x_2 + (1 + x_1 + x_2 || grp)

#4

Thank you for your responses @paul.buerkner and @martinmodrak . I think I’m now more clear about the model specification in brms. I appreciate your help.