I’m fitting a multivariate logit response model with correlated varying intercepts using `brms`, following this vignette. The application is modeling multiple survey responses per respondent. I am trying to understand what the exact model specification is when using the nested notation `(1 | id | group)`.

A simplified version of the model I’m fitting looks like this (full reproducible example below):

```
form <- bf(mvbind(pres, gov) ~ (1 | id | county_fips))
out <- brm(form,
           data = dat,
           family = bernoulli)
```

Both outcomes (`pres` and `gov`) are binary, and I want to include random intercepts at the county level that are correlated across outcomes.

I can’t find documentation on exactly what model `(1 | id | county_fips)` denotes. The generated Stan code for the model above looks like this:

```
// generated with brms 2.18.0
functions {
  /* compute correlated group-level effects
   * Args:
   *   z: matrix of unscaled group-level effects
   *   SD: vector of standard deviation parameters
   *   L: cholesky factor correlation matrix
   * Returns:
   *   matrix of scaled group-level effects
   */
  matrix scale_r_cor(matrix z, vector SD, matrix L) {
    // r is stored in another dimension order than z
    return transpose(diag_pre_multiply(SD, L) * z);
  }
}
data {
  int<lower=1> N;  // total number of observations
  int<lower=1> N_pres;  // number of observations
  array[N_pres] int Y_pres;  // response variable
  int<lower=1> N_gov;  // number of observations
  array[N_gov] int Y_gov;  // response variable
  // data for group-level effects of ID 1
  int<lower=1> N_1;  // number of grouping levels
  int<lower=1> M_1;  // number of coefficients per level
  array[N_pres] int<lower=1> J_1_pres;  // grouping indicator per observation
  array[N_gov] int<lower=1> J_1_gov;  // grouping indicator per observation
  // group-level predictor values
  vector[N_pres] Z_1_pres_1;
  vector[N_gov] Z_1_gov_2;
  int<lower=1> NC_1;  // number of group-level correlations
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
  real Intercept_pres;  // temporary intercept for centered predictors
  real Intercept_gov;  // temporary intercept for centered predictors
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  matrix[M_1, N_1] z_1;  // standardized group-level effects
  cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
}
transformed parameters {
  matrix[N_1, M_1] r_1;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_1] r_1_pres_1;
  vector[N_1] r_1_gov_2;
  real lprior = 0;  // prior contributions to the log posterior
  // compute actual group-level effects
  r_1 = scale_r_cor(z_1, sd_1, L_1);
  r_1_pres_1 = r_1[ : , 1];
  r_1_gov_2 = r_1[ : , 2];
  lprior += student_t_lpdf(Intercept_pres | 3, 0, 2.5);
  lprior += student_t_lpdf(Intercept_gov | 3, 0, 2.5);
  lprior += student_t_lpdf(sd_1 | 3, 0, 2.5)
            - 2 * student_t_lccdf(0 | 3, 0, 2.5);
  lprior += lkj_corr_cholesky_lpdf(L_1 | 1);
}
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N_pres] mu_pres = rep_vector(0.0, N_pres);
    // initialize linear predictor term
    vector[N_gov] mu_gov = rep_vector(0.0, N_gov);
    mu_pres += Intercept_pres;
    mu_gov += Intercept_gov;
    for (n in 1 : N_pres) {
      // add more terms to the linear predictor
      mu_pres[n] += r_1_pres_1[J_1_pres[n]] * Z_1_pres_1[n];
    }
    for (n in 1 : N_gov) {
      // add more terms to the linear predictor
      mu_gov[n] += r_1_gov_2[J_1_gov[n]] * Z_1_gov_2[n];
    }
    target += bernoulli_logit_lpmf(Y_pres | mu_pres);
    target += bernoulli_logit_lpmf(Y_gov | mu_gov);
  }
  // priors including constants
  target += lprior;
  target += std_normal_lpdf(to_vector(z_1));
}
generated quantities {
  // actual population-level intercept
  real b_pres_Intercept = Intercept_pres;
  // actual population-level intercept
  real b_gov_Intercept = Intercept_gov;
  // compute group-level correlations
  corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
  vector<lower=-1, upper=1>[NC_1] cor_1;
  // extract upper diagonal of correlation matrix
  for (k in 1 : M_1) {
    for (j in 1 : (k - 1)) {
      cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
    }
  }
}
```
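
As an aside, in case it helps anyone reproduce this: I believe the Stan code above can be generated without fitting the model by calling

```r
make_stancode(form, data = dat, family = bernoulli)
```

using the `form` and `dat` objects from the reproducible example at the bottom of this post.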

It’s a bit tough for me to fully parse this model given the non-centered parameterizations, Cholesky factorizations, etc. But here’s my best interpretation:

Let i index observations, j = 1, \dots, J index outcome variables, and let c[i] denote the county of respondent i. The model is given by

\begin{align} y_{ij} &\sim \text{Bernoulli}(\theta_{ij}) \\ \theta_{ij} &= \text{logit}^{-1}(\delta_j + \alpha_{c[i],j}) \\ \alpha_c &\sim \text{MVN}\big(0, \text{diag}(\sigma_1, \dots, \sigma_J) \,\, \Omega \,\, \text{diag}(\sigma_1, \dots, \sigma_J)\big) \\ \sigma_j &\sim \text{Student-}t^{+}(3, 0, 2.5) \\ \Omega &\sim \text{LKJ}(1) \end{align}

Here \alpha_c is the length-J vector of random intercepts across outcomes for county c, \Omega is a J \times J correlation matrix, and the \sigma_j's are the standard deviations of the county-level random effects (with half-Student-t priors, matching the truncation in the Stan code).

Given that setup, I have a few questions: First, most basically, am I interpreting the code correctly – is this indeed the model that is being estimated?

Second, if the model I wrote down is basically correct, I do not see any modeled correlation *between* counties in the random intercepts. To be more precise, let \alpha^j denote the vector of random intercepts for outcome j across all counties. Then we have \alpha^{j} \sim \text{MultivariateNormal}(0, \sigma_j^2 I) – i.e., the covariance matrix is diagonal. Is that correct?
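
To make this concrete, my reading of the implied covariance structure (in the notation I introduced above, where c and c' index counties and j and j' index outcomes) is

\begin{align} \text{Cov}(\alpha_{c,j}, \alpha_{c',j'}) = \begin{cases} \sigma_j \, \sigma_{j'} \, \Omega_{jj'} & \text{if } c = c' \\ 0 & \text{if } c \neq c' \end{cases} \end{align}

so the random effects are correlated across outcomes within a county, but independent across counties. Please correct me if I've misread the Stan code here.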

Finally, is there a way to extend the model in `brms` to allow for correlated random effects *both* across outcomes and across counties (for the same outcome)? It seems like one possibility would be to stack the dataset, duplicating rows J times (once for each outcome), then estimating a model like this:

```
bf(y ~ (1 | county) + (1 | outcome))
```
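
For concreteness, here is roughly how I imagine the stacking would work with the variables from my reproducible example below (I haven't actually fit this version, so treat it as a sketch):

```r
# Stack the two outcomes into long format: one row per respondent-outcome pair
dat_long <- dat %>%
  mutate(respid = row_number()) %>%
  pivot_longer(c(pres, gov), names_to = "outcome", values_to = "y")

# Random intercepts for county, outcome, and (optionally) their interaction
form_long <- bf(y ~ (1 | county_fips) + (1 | outcome) + (1 | county_fips:outcome))
out_long <- brm(form_long,
                data = dat_long,
                family = bernoulli)
```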

I believe this would include a county-level effect that is constant across all outcomes and an outcome-level effect that is constant across counties. Is there a reason not to take this “stacked” approach and to use `mvbind()` instead? The stacked approach seems more flexible, as you could easily include additional effects for interactions `(1 | county:outcome)` or survey respondents `(1 | respid)`. But I might be totally misunderstanding something about the nested grouping model!

Thank you – this is the first time I’ve used `brms` for a full project and it has cut down my coding time immensely.

Reproducible example:

```
library(brms)
library(dataverse)
library(tidyverse)

Sys.setenv("DATAVERSE_SERVER" = "dataverse.harvard.edu")

# Download CCES data
dat <- get_dataframe_by_name("cumulative_2006-2022.rds",
                             dataset = "10.7910/DVN/II2DB6",
                             original = TRUE,
                             .f = readRDS)

dat <- dat %>%
  filter(year == 2020,
         state == "North Carolina") %>%
  mutate(across(c(voted_pres_party, voted_gov_party),
                ~ case_when(.x == "Democratic" ~ 1L,
                            .x == "Republican" ~ 0L,
                            TRUE ~ NA_integer_))) %>%
  rename_with(.cols = c(voted_pres_party, voted_gov_party),
              .fn = ~ gsub("voted\\_|\\_party", "", .x)) %>%
  drop_na(gov, pres, county_fips)

# Fit model
form <- bf(mvbind(pres, gov) ~ (1 | id | county_fips))
out <- brm(form,
           data = dat,
           family = bernoulli,
           chains = 2,
           cores = 2,
           iter = 1000,
           backend = "cmdstanr")
```