Syntax/inferences for nested non-linear MLM

Hi,

I am trying to fit a nested non-linear MLM using brms. I’ve worked through much of the (helpfully detailed) documentation and vignettes to arrive at what I think the correct syntax should be, but brms is throwing an error that I can’t seem to resolve.

The model I am trying to fit has three levels, with level 1 nested in level 2, and level 2 nested in level 3 (there is a separate level 1 nesting that is also included). There are level 3 covariates that I want to incorporate only in the estimation of the level 3 parameters.

Notationally, this is a rough sketch of the model I am trying to fit:

(Approximate number of observations per level: N=300,000, J=25,000, K=90, T=25)

Ultimately, I am interested in the interactive effect of z_1 and the level-1 covariates for which I specify varying slopes (x_2 through x_4), which I can recover through the nesting which links the slopes for those covariates to z_1.

The formula I am providing to brm is :

y ~ b1x1 + b2x2 + b3x3 + b4x4 + gamma + alpha,
alpha ~ (1|alpha_groupings),
gamma ~ (1 | lev2cor | gamma_groupings) + mu_gamma,
b1 ~ 1,
b2 ~ (1 | lev2cor | gamma_groupings) + mu_b2,
b3 ~ (1 | lev2cor | gamma_groupings) + mu_b3,
b4 ~ (1 | lev2cor | gamma_groupings) + mu_b4,
mu_gamma ~ (1| lev3cor | mu_groupings) + z1,
mu_b2 ~ (1| lev3cor | mu_groupings) + z1,
mu_b3 ~ (1| lev3cor | mu_groupings) + z1,
mu_b4 ~ (1| lev3cor | mu_groupings) + z1

I keeping getting the following error: “Error: The parameter ‘mu_gamma’ is not a valid distributional or non-linear parameter. Did you forget to set ‘nl = TRUE’?”

So, at this point, I have the following three questions:

  1. To verify, does the model formula I have specified mirror the model I wrote out above in the screenshot?

  2. I’ve consulted this previous question that dealt with the same issue, and tried a few tweaks, but can’t seem to figure out how to get around this error. How should I adjust the model code?

  3. My plan for examining the interactive effect of z_1 on the level 1 covariates is to generated predicted quantities on data for new observations, and compare those quantities across different values of z_1. Because z_1 is a level 3 predictor, I think I would need to sample new level 3 parameters (and by consequence level 2 parameters) for each new value of z_1 in order for my predicted quantities to be comparable (i.e., “leave all but z_1 fixed,” though the level 2 and 3 offsets will not be technically fixed).

Looking through the documentation, I think I can do this by using the “sample_new_levels” argument in brms’ fit function; given the specified z_1, it would sample new level 2 and 3 parameters for an observation with “new” IDs for those levels (i.e., IDs not included in the model), and the resulting predicted quantities would include the effect of z_1 on the level 1 covariates.

Does that sound 1) like a reasonable way to make those inferences, and 2) feasible using brms in the way I described?

Thank you in advance for your help!

Please also provide the following information in addition to your question:

  • Operating System: macOS Mojave 10.14.4
  • brms Version: 2.9

I’m not super familiar with the brms syntax, but have you tried working with nl = TRUE?

The way you’ve specified this model makes me think you want it that way: https://cran.r-project.org/web/packages/brms/vignettes/brms_nonlinear.html

The explicit parameter/data multiplication here is what you do for non-linear models in brms I think:

b1 * x1 + b2 * x2 + b3 * x3 + b4 * x4

In the nl = FALSE case, you’d just list your covariates there with the appropriate groupings. Something like:

x1 + (x2 | lev2cor | gamma_groupings) + (x3 | lev2cor | gamma_groupings) + (x4 | lev2cor | gamma_groupings)

I think (but I’m not sure).

Yes, I have nl set to TRUE. I realized that some of the formula did not render the way I wanted it to, so to repost, the formula I am providing is:

brm(bf(y ~ b1 * x1 + b2 * x2 + b3 * x3 + b4 * x4 + gamma + alpha,
alpha~ 1 + (1|alpha_groupings),
gamma ~ (1 | lev2cor | gamma_groupings) + mu_gamma,
b1 ~ 1,
b2 ~ (1 | lev2cor | gamma_groupings) + mu_b2,
b3 ~ (1 | lev2cor | gamma_groupings) + mu_b3,
b4 ~ (1 | lev2cor | gamma_groupings) + mu_b4,
mu_gamma ~ (1| lev3cor | mu_groupings) + z1,
mu_b2 ~ (1| lev3cor | mu_groupings) + z1,
mu_b3 ~ (1| lev3cor | mu_groupings) + z1,
mu_b4 ~ (1| lev3cor | mu_groupings) + z1
nl=TRUE),
data= data, family=bernoulli(link = “logit”),
prior=priors,
iter = 2000, chains = 4, cores=4, refresh = 10, seed = 13)

I will note that I respecified this as a two-level model, as follows, and it does run (though it has convergence issues I need to address):

brm(bf(y ~ b1 * x1 + b2 * x2 + b3 * x3 + b4 * x4 + gamma + alpha,
alpha~ 1 + (1|alpha_groupings),
gamma ~ (1 | lev2cor | gamma_groupings) + z1,
b1 ~ 1,
b2 ~ (1 | lev2cor | gamma_groupings) + z1,
b3 ~ (1 | lev2cor | gamma_groupings) + z1,
b4 ~ (1 | lev2cor | gamma_groupings) + z1,
nl=TRUE),
data= data, family=bernoulli(link = “logit”),
prior=priors,
iter = 2000, chains = 4, cores=4, refresh = 10, seed = 13)

I’m not sure why adding that third level causes issues; I thought that maybe I needed to explicitly specify a parameter for each occurrence of z1, but given that this two-level model runs without that, that does not seem to be the issue.

Edit: I am wrong, look at Paul’s post: Syntax/inferences for nested non-linear MLM

Ah, I’m in over my head then. I played around with this a bit though. Maybe try breaking things in multiple formulas:

f1 = bf(y ~ b1 * x1 + b2 * x2 + b3 * x3 + b4 * x4 + gamma + alpha,
       alpha~ 1 + (1|alpha_groupings),
       gamma ~ (1 | lev2cor | gamma_groupings) + mugamma,
       b1 ~ 1,
       b2 ~ (1 | lev2cor | gamma_groupings) + mub2,
       b3 ~ (1 | lev2cor | gamma_groupings) + mub3,
       b4 ~ (1 | lev2cor | gamma_groupings) + mub4,
       nl=TRUE)

f2 = bf(mugamma ~ (1| lev3cor | mu_groupings) + z1,
         mub2 ~ (1| lev3cor | mu_groupings) + z1,
         mub3 ~ (1| lev3cor | mu_groupings) + z1,
         mub4 ~ (1| lev3cor | mu_groupings) + z1,
         nl = TRUE)

brm(f1 + f2,
    data= data, family=bernoulli(link = "logit"),
    prior=priors,
    iter = 2000, chains = 4, cores=4, refresh = 10, seed = 13)

That at least doesn’t throw errors for me (though I’m not passing in any data). My R made me remove the underscores from variable names.

@paul.buerkner is this right?

Breaking it into multiple formulas the way you did would imply a multivariate model, which we don’t have here. But if we just put the formulas in f1 and f2 into the same bf call, it should be close to working. The only thing that needs to be fixed is that a formula can either be parsed in a standard or non-linear manner. That is,

gamma ~ (1 | lev2cor | gamma_groupings) + mugamma

is invalid as it both contains standard parts and “non-linear” parts (mugamma). If you want an additional formula to be parsed in a non-linear manner, just wrap it in nlf, for instance nlf(gamma ~ phi + mugamma) and then phi ~ (1 | lev2cor | gamma_groupings).

2 Likes

Thank you @bbbales2 and @paul.buerkner for your help! I implemented the changes and have had the model running for a little over a day on a small subset (~7% of the 300k observations); 3 of the 4 chains have finished sampling, the 4th is almost there but has been slow. Based on a smaller subset I ran, I am anticipating having to deal with convergence issues when it is finally done, and I’ve seen some suggestions on how to try to address those from @paul.buerkner in response to other questions (e.g., stronger priors).

Given the complexity in fitting non-linear models with brms (or any Bayesian approach, in R or otherwise), I have a follow-up question that I perhaps should have asked at the outset: does the model I specified in my first post require fitting with non-linear syntax, or would substituting the higher-order terms into the population-level equation and fitting it with linear syntax yield a conceptually similar model?

That is to say, consider instead this model:

brm(y ~ b1 * x1 + b2 * x2 + b3 * x3 + b4 * x4 + z1 + x2:z1 + x3:z1 + x4:z1 + (1|alphacor|alpha_groupings) + (1 + x2 + x3 + x4 + x2:z1 + x3:z1 + x4:z1 | lev2cor | level2_groupings) + (1 + x2 + x3 + x4 + x2:z1 + x3:z1 + x4:z1| lev3cor | level3_groupings),
data= data, family=bernoulli(link = “logit”),
prior=priors,
iter = 2000, chains = 4, cores=4, refresh = 10, seed = 13)

Following the conventions of lme4 syntax pertaining to including group-level predictors, this formula places the level 3 predictor (z1) and cross-level interaction terms (x2:z1, x3:z1, x4:z1) at the population-level, and includes offsets for the intercept, slopes on the level 1 covariates of interest (x2, x3, and x4), and slopes on the cross-level interaction terms (x2:z1, x3:z1, x4:z1).

I was initially working with this formulation, and was able to fit models that converged using a subset of the data (~20%) but backed off and switched to the non-linear formulation when I looked at the Stan code produced to fit this brms model because, in the Stan code, z1 is included as a level 1 covariate that is not used to inform the estimation of the level 3 (and subsequently level 2) offsets, rather than as a level 3 covariate that is used to inform the estimation of those offsets.

Am I right to have switched to the non-linear formulation, given my inferences of interest (i.e., the interaction of z1 with x2, x3, and x4)? Or am I making this unnecessarily complicated by moving to the non-linear formulation?

Due to the complexity of your model, I am not 100% sure if is expressable as a linear formula, but I believe it should be expressable as one. I can’t tell exactly whether the linear formula you provided but in general I would aim for a linear formulation if possible as it will likely speed up sampling by a lot since brms can perform some optimization in this case.

1 Like

@paul.buerkner, thank you for responding again.

I reworked the model to express it with a linear formula; does this and the final brm formula look right to you?

To simplify the algebra, consider the same model as the original but with only one covariate with a varying slope:

where all \mu and \delta are grand means at their respective levels, and all \eta and \nu are offsets.

Substitute the level 3 expressions into the level 2 equations:

Capture1

Then substitute both sets of level 2 expressions into the level 1 equation:

Then expand terms and rearrange:

Looking at this final re-expression, and taking the terms line-by-line, we have:

  • The grand intercept and the intercept offsets (\mu_{\alpha 0}+\delta_{\mu_{\alpha}0}+\mu_{\gamma_{0}} and \nu_{\mu_{\alpha}k[j[i]]}+\eta_{\alpha j[i]} + \eta_{\gamma_{0}s[t]}, respectively)
  • The covariate x_{1} and its non-varying coefficient \beta_{1}
  • The grand slope and offsets for the covariate x_2 (\mu_{\beta_{2} 0}\cdot \text{x}_{2i} + \delta_{\mu_{\beta_{2}} 0} \cdot \text{x}_{2i} and \nu_{\mu_{\beta_{2}}k[j[i]]} \cdot \text{x}_{2i} +\eta_{\beta_{2}j[i]} \cdot \text{x}_{2i}, respectively)
  • The group-level covariate z_{1} and its coefficient \delta_{\mu_{\alpha}1}
  • The cross-level interaction between z_{1} and x_{2} and its coefficient \delta_{\mu_{\beta_{2}} 1}

Given this, I think the brm formula needs to specify varying slopes for the level 1 covariates, but not for the cross-level interaction terms, so, with all three covariates with varying slopes:

brm(y ~ x1 + x2 + x3 +x4 + z1 + x2:z1 + x3:z1 + x4:z1 + 
(1|alphacor|alpha_groupings) + 
(1 + x2 + x3 + x4 | lev2cor | level2_groupings) + 
(1 + x2 + x3 + x4 | lev3cor | level3_groupings),
data= data, family=bernoulli(link = “logit”),
prior=priors,
iter = 2000, chains = 4, cores=4, refresh = 10, seed = 13)

This looks correct to me, although I can’t tell with absolute certainty as this indexing war is no fun :-)

1 Like

Understood. If only we had a universal indexing system… ;) Thank you for your help! Models are running, fingers crossed!

1 Like