Thanks for the suggestion Martin. I am familiar with the vignette and was looking to do something different (because the methods shown there wonâ€™t work for my problem).

I finally had time to look into this and explore how to use the `stanvars`

function to do what I want to do.

Before presenting my (half working) solution, Iâ€™ll recap the problem:

I want to estimate the total effect of e on y, but y has missing data.

The full data generating process is (omitting error variances):

m \sim N(e,\sigma_m) \\ y \sim N(e\cdot l + m,\sigma_y) where l\cdot e stands for the interaction of variables l and e. Further, the probability of missing data is

P_{miss_y} \sim bern(inv.logit(m + l))

Inspired by the full information maximum likelihood (FIML) approach to dealing with missing data, one way to deal with missing data is to simultaneously

- imputed the missing values and
- estimate y \sim e
- estimate the co-variances of y,m,e,l

Such a model can be implemented in brms as follows (with all variables in the data.frame `my_data`

):

(a) Specify code to be added to the parameters and model blocks of the basic brms model.

```
parameters_code = "
corr_matrix[4] Omega;
vector<lower=0>[4] tau;
vector[4] mus;
"
```

Variables are now in capital letters. `Yl`

is the brms-generated variable with imputed missing values.

```
model_code = "
mus ~ normal(0,2);
tau ~ normal(0,2);
Omega ~ lkj_corr(2.0);
for (i in 1:N) {
target += multi_normal_lpdf(to_vector({M[i],iEL[i],Yl[i]}) |
mus,
quad_form_diag(Omega,tau));
}
"
```

put model code to be added and auxiliary variables in a `stan_vars`

object:

```
stanvars =
stanvar(x = as.vector(my_data$E),
name = "E") +
stanvar(x = as.vector(my_data$M),
name = "M") +
stanvar(x = as.vector(my_data$E*my_data$L),
name = "iEL") +
stanvar(block = "parameters",
scode = parameters_code) +
stanvar(block = "model",
scode = model_code)
```

(Try to) run the modified brms model-

```
sf <- brm(bf(Y | mi() ~ E),
data = mar_interaction_data,
stanvars = stanvars)
```

I write â€śtry to runâ€ť because there remain 2 problems:

- brms puts the additional model code between the variable declarations and the rest of the original brms code, i.e. before missing values are put into the vector
`Yl`

. I can work around this by manually modifiying the brms code, but maybe @paul.buerkner has an idea that is more elegant than what is described here?
- Even if there is an elegant way to solve this coding problem, I found that the estimation of the co-variance and the effect model interfere with each other. Only when I multiply the log posterior for the covariance estimation with a large number (e.g. 100) am I able to recover the correct parameter values. Iâ€™ll describe this problem in a different thread.