Joint model in brms with outcomes at different levels & shared random effects

Hi everyone - I am interested in whether I can fit a joint multilevel model, in brms, in which the responses are at different levels and the random effects from each submodel are allowed to correlate. The option to subset in brms feels like it might be possible… although I haven’t managed it yet, so just wondering if anyone might suggest something I’ve missed, or whether it isn’t currently possible.

So taking a simplified example of a 2-level bivariate outcome model in which one response (y_{1ij}, below: the outcome for occasion i (i = 1, ..., n_j) for individual j(j=1, ..., J)) is a repeatedly-measured continuous outcome within individual with a random intercept (u_{0j}) for individual in that submodel, and the other response (y_{2j}) is an individual-level continuous outcome, i.e. at level 2, with its residual error (at the individual-level; u_{1j}) allowed to correlate with the random intercept from the longitudinal submodel.

\begin{aligned}y_{1ij} & = \beta_0 + u_{0j} + e_{ij}\\[6pt] y_{2j} & = \gamma_0 + u_{1j}\\[8pt] \left(\begin{array}{l}u_{0j}\\u_{1j}\end{array}\right) & \sim \mbox{N} \left[\left(\begin{array}{l}0\\0\end{array}\right), \left(\begin{matrix}\sigma^2_{u0} & \\ \sigma_{u01} & \sigma^2_{u1}\\ \end{matrix}\right)\right]\\[6pt] e_{ij} & \sim \mbox{N}(0,\sigma^2_e)\\ \end{aligned}

So, I’ve tried: fitting a random intercept for y2 (attempt 1, below); allowing for random effects for sigma for y2, with a fixed intercept for sigma (attempt 2, below); allowing for random effects for sigma for y2, without a fixed intercept for sigma (attempt 3, below)… with no apparent success. I appreciate it may not be possible, but if anything occurs to anyone, it would be great if you could please let me know! Thanks!

E.g. simulating some data (NB: I’m using brms v.2.10.0; OS: Windows 10)

library(brms)
options(mc.cores = parallel::detectCores())
library(MASS)

set.seed(93801)

# number of individuals:
J <- 1000
# number of repeated measures per individual:
n <- 10
# total number of observations:
N <- n * J

# individual-level indicator at level 1 (observation level):
Person_ID <- rep(1:J, each = n)

# SDs of random effects, and their correlations:
sigma_u <- c(0.7, 0.4)
rho_01 <- 0.4

# number of individual-level random effects:
n_u_total <- length(sigma_u)

# generate correlation and covariance matrices for random effects:
corr_u <- matrix(, nrow = n_u_total, ncol = n_u_total)
corr_u[lower.tri(corr_u, diag = FALSE)] <- rho_01
diag(corr_u) <- rep(1, times = n_u_total)
corr_u[upper.tri(corr_u)] <- t(corr_u)[upper.tri(corr_u)]
cov_u <- diag(sigma_u) %*% corr_u %*% diag(sigma_u)

# generate random effects (at level 2):
u <- MASS::mvrnorm(n = J,
mu = rep(0, times = n_u_total),
Sigma = cov_u)

# expand out to level 1:
u_long <- u[Person_ID, ]

# generate level 1 residuals:
sigma_e <- 2.25
e <- rnorm(n = N, mean = 0, sd = sigma_e)

# coefficient values for fixed effects in mean function for y1:
beta <- 1

## generate repeatedly-measured outcome
y1 <- beta + u_long[, 1] + e

# coefficient values for fixed effects in mean function for y2:
gamma <- 1

# generate individual-level outcome:
y2 <- gamma + u[, 2]

# generate indicator to enable subsetting in call to brm
subset_indicator <- rep(FALSE, times = N)
subset_indicator[which(!duplicated(Person_ID))] <- TRUE

# combine into data frame
df <- data.frame(y1 = y1,
Person_ID = Person_ID,
y2 = y2[Person_ID],
subset_indicator = subset_indicator)

## what have I ended up with? can I get these back from model fit?
# sigma for random intercept for y1:
sd(u[, 1])
# [1] 0.7005967
# sigma for random intercept for y2:
sd(u[, 2])
# [1] 0.3974167
# correlation between random effects:
cor(u[, 1], u[, 2])
# [1] 0.4365777


Attempt 1: random intercept for individual-level outcome:

bform <-
bf(
y1 ~ 1 + (1 |s| Person_ID),
family = gaussian()
)  +
bf(
y2 | subset(subset_indicator) ~ 1 + (1 |s| Person_ID),
family = gaussian()) +
set_rescor(FALSE)
fit <- brm(bform, data = df, seed = 79634)

summary(fit)

# Problems: presumably a lack of identifiability (as estimating individual-level random effects for y2 twice (as the random intercept for y2, and as sigma)):

# The model has not converged (some Rhats are > 1.1). Do not analyse the results!
#  We recommend running more iterations and/or setting stronger priors.There were 159 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.
# See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup Family: MV(gaussian, gaussian)
# Links: mu = identity; sigma = identity
# mu = identity; sigma = identity
# Formula: y1 ~ 1 + (1 | s | Person_ID)
# y2 | subset(subset_indicator) ~ 1 + (1 | s | Person_ID)
# Data: df (Number of observations: 10000)
# Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
# total post-warmup samples = 4000
#
# Group-Level Effects:
#   ~Person_ID (Number of levels: 1000)
#                                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# sd(y1_Intercept)                   0.70      0.03     0.64     0.76 1.01      629      999
# sd(y2_Intercept)                   0.27      0.06     0.16     0.37 1.19       16       35
# cor(y1_Intercept,y2_Intercept)     0.66      0.16     0.43     0.98 1.15       18       40
#
# Population-Level Effects:
#              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# y1_Intercept     1.00      0.03     0.93     1.06 1.01      962     1857
# y2_Intercept     1.01      0.01     0.99     1.04 1.01      545     1542
#
# Family Specific Parameters:
#          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# sigma_y1     2.25      0.02     2.21     2.28 1.01     1958     1984
# sigma_y2     0.28      0.06     0.16     0.36 1.20       16       29
#
# Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
# is a crude measure of effective sample size, and Rhat is the potential
# scale reduction factor on split chains (at convergence, Rhat = 1).


Attempt 2: random effects for sigma for individual-level outcome:

bform <-
bf(
y1 ~ 1 + (1 |s| Person_ID),
family = gaussian()
)  +
bf(
y2 | subset(subset_indicator) ~ 1,
sigma ~ (1 |s| Person_ID),
family = gaussian()) +
set_rescor(FALSE)
fit <- brm(bform, data = df, seed = 79634)
summary(fit)

# NB By default (as I didn't specify otherwise), it's fitted a fixed intercept for sigma for y2...
# exp(sigma_y2_Intercept) == exp(-0.93) == 0.39, so close to the value of
# sigma for y2 random effects the data was simulated from, above.
# However, I don't think I've specified the model I was intending:
# e.g. what are sd(sigma_y2_Intercept) and
# cor(y1_Intercept,sigma_y2_Intercept) estimating?

# Family: MV(gaussian, gaussian)
# Links: mu = identity; sigma = identity
# mu = identity; sigma = log
# Formula: y1 ~ 1 + (1 | s | Person_ID)
# y2 | subset(subset_indicator) ~ 1
# sigma ~ (1 | s | Person_ID)
# Data: df (Number of observations: 10000)
# Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
# total post-warmup samples = 4000
#
# Group-Level Effects:
#   ~Person_ID (Number of levels: 1000)
#                                      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# sd(y1_Intercept)                         0.70      0.03     0.64     0.76 1.00     1909     2888
# sd(sigma_y2_Intercept)                   0.07      0.04     0.00     0.17 1.00     1209     1513
# cor(y1_Intercept,sigma_y2_Intercept)    -0.45      0.38    -0.96     0.54 1.00     3951     2378
#
# Population-Level Effects:
#                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# y1_Intercept           1.00      0.03     0.94     1.06 1.00     4587     3578
# y2_Intercept           1.02      0.01     0.99     1.04 1.00     4937     2960
# sigma_y2_Intercept    -0.93      0.02    -0.98    -0.88 1.00     6144     2742
#
# Family Specific Parameters:
#          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# sigma_y1     2.25      0.02     2.21     2.28 1.00     5379     3043
#
# Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
# is a crude measure of effective sample size, and Rhat is the potential
# scale reduction factor on split chains (at convergence, Rhat = 1).



Attempt 3: as above, but drop fixed intercept for sigma for y2

bform <-
bf(
y1 ~ 1 + (1 |s| Person_ID),
family = gaussian()
)  +
bf(
y2 | subset(subset_indicator) ~ 1,
sigma ~ 0 + (1 |s| Person_ID),
family = gaussian()) +
set_rescor(FALSE)
fit <- brm(bform, data = df, seed = 79634)
summary(fit)

# The estimated correlation between random effects again suggests not estimating
# the model I intended:

# Family: MV(gaussian, gaussian)
# Links: mu = identity; sigma = identity
# mu = identity; sigma = log
# Formula: y1 ~ 1 + (1 | s | Person_ID)
# y2 | subset(subset_indicator) ~ 1
# sigma ~ 0 + (1 | s | Person_ID)
# Data: df (Number of observations: 10000)
# Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
# total post-warmup samples = 4000
#
# Group-Level Effects:
#   ~Person_ID (Number of levels: 1000)
#                                      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# sd(y1_Intercept)                         0.71      0.03     0.65     0.77 1.00     1496     2444
# sd(sigma_y2_Intercept)                   0.91      0.06     0.80     1.02 1.00     1145     1984
# cor(y1_Intercept,sigma_y2_Intercept)    -0.26      0.13    -0.51    -0.00 1.02      365      733
#
# Population-Level Effects:
#              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# y1_Intercept     0.91      0.05     0.80     1.02 1.01      469      936
# y2_Intercept     1.04      0.02     1.00     1.07 1.00     1751     2454
#
# Family Specific Parameters:
#          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# sigma_y1     2.25      0.02     2.21     2.28 1.00     6599     3074
#
# Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
# is a crude measure of effective sample size, and Rhat is the potential
# scale reduction factor on split chains (at convergence, Rhat = 1).

## NB random effects for sigma for y2 are on the log scale
my_post_samples_y2_RE <- posterior_samples(fit, "^r_Person_ID__sigma_y2.")
my_post_samples_y1_RE <- posterior_samples(fit, "^r_Person_ID__y1.")
exp_my_post_samples_y2_RE <- exp(my_post_samples_y2_RE)
post_summary_y1_RE <- posterior_summary(my_post_samples_y1_RE)
post_summary_exp_y2_RE <- posterior_summary(my_post_samples_y2_RE)
sd(post_summary_exp_y2_RE[, "Estimate"])
# [1] 0.2962514
cor(post_summary_exp_y2_RE[, "Estimate"], post_summary_y1_RE[, "Estimate"])
# [1] -0.4854978

1 Like

Hi,
sorry, don’t have time to answer now, but maybe @ucfagls or @jonah can weigh in?

Best of luck with your model!

Ok, sorry that didn’t work out, could @paul.buerkner weigh in?

This example is very detailed and I currently don’t have time to investigate it in detail. All I can say is that using the subset` addition argument is the right way to go and should be sufficient to specify such a model.

I would to first specify the two models separately, and get them to work separately. Then one can still try to bring them together (perhaps this approach was already tried and I have overlooked it).

1 Like

Thanks for following up, @martinmodrak, and for confirming that the subset argument is the right way to go, @paul.buerkner - I’ll investigate further and let you know of any progress…