So one way to do this with multivariate data in `brms`

is to convert it to “long format” (each dimension-value pair gets its own row). Than one can do “sort-of” multivariate model by having a varying intercept per each row in the original data. The model is not fully identified by default, but one can fix the `sd`

of the varying intercept to make it identifiable.

So assuming simulated data like:

```
rcumulative_logit <- function(N, mean, thresholds) {
raw <- rlogis(N) + mean
res <- rep(1, N)
for(t in thresholds) {
res <- res + (t < raw)
}
res
}
set.seed(58224522)
N_patients <- 7
simple_test <- data.frame(patient = factor(1:N_patients)) %>%
crossing(time_from_diagnosis = c(0, 3, 5)) %>%
mutate(q1 = rcumulative_logit(n(), rnorm(N_patients)[as.integer(patient)] + time_from_diagnosis * 0.5, thresholds = c(-1,-0.3,0,1)),
q2 = rcumulative_logit(n(), rnorm(N_patients, sd = 0.5)[as.integer(patient)] + time_from_diagnosis * 0.1, thresholds = c(-0.4,-0.1,0.5,1.5)),
q3 = rcumulative_logit(n(), rnorm(N_patients, sd = 2)[as.integer(patient)] + time_from_diagnosis * 0.3, thresholds = c(-2,-0.8,-0.2,1.1)),
obs_id = 1:n()
)
simple_test_long <- simple_test %>%
pivot_longer(starts_with("q"), names_to = "question", values_to = "answer")
```

`obs_id`

now binds all the response that were previously the same row.

The idea is that we add a varying intercept that corresponds to the correlated residuals, i.e. we add one parameter per observed answer as a varying intercept of the form `(0 + question | obs_id)`

where `obs_id`

uniquely identifies each row in the wide dataset. To keep the model identified, we fix the `sd`

of this varying intercept.

We then increase the `disc`

parameter of the `cumulative("logit")`

distribution (which I currently do by adding `disc ~ 1`

to the formula and setting a constant prior on this intercept. This is *almost* the same as using actual multivariate distribution, the difference is that the observation noise now has two components: one independent from the logistic distribution of `ordered_logistic`

(with roughly `sd = 1/disc`

) and one correlated from the varying intercept (where we set the `sd = 1`

). So as `disc -> inf`

we get a closer and closer approximation to multivariate probit, but also increase risk of computational issues. `disc = 10`

seems to work quite OK.

Note: it does not really matter which link we put in the `cumulative`

response (as we are trying to minimize it’s effect), but the `logit`

link is much more numerically stable than the `probit`

link, so we use it. Still, this is a `probit`

model, because the main noise component is normally distributed.

```
fit_thres_rescor_approx <- brm(bf(
answer | thres(gr = question) ~ time_from_diagnosis + (1 | patient) + (0 + question | obs_id), disc ~ 1, family = cumulative(link = "logit", link_disc = "identity")),
prior = c(
prior(constant(1), class = "sd", group = "obs_id"),
prior(student_t(3, 0, 5*2.5), class = "Intercept"),
prior(constant(10), class = "Intercept", dpar = "disc")
),
data = simple_test_long
)
```

I know this is not a very good approximation if you have many questions (roughly more than 5, but didn’t investigate thoroughly), still it models *some* correlation even in the higher-dimensional case. The model will also have trouble initializing properly for large data - reducing the `constant(10)`

makes it better behaved but also further from a pure multivariate probit. But is multivariate, ordinal, correlated, works with `brms`

out of the box and I verified it is a faithful approximation to multivariate probit for 3 questions.

For fitting, you simply drop the rows with missing question-answer pairs from the long dataset. To impute them, you predict for the whole dataset.

This code was developed and tested as a part of ongoing project with @jroon, so if you end up using it, some form of acknowledgement would be great :-) (I feel a bit shameful asking for this, but my employer is recently being a bit more inquisitive about my publication output).

You can hack `brms`

to work with latent variables - actually working on this in the project, but I don’t need to handle missing values which would complicate things another bit. However, it gets quite complex quite fast and requires you to be comfortable with both `brms`

internals and coding in Stan, so if the above version works well for you, I would stick with it.

I also took the liberty to move this to a new thread as I think it is getting far from the original question.