# Multivariate model to understand effect of different datasets using the same model formula

I am working on a model that should help me understand the differences between three similar datasets collected in different environments (`E1`, `E2`, `E3`). All three datasets contain repetitions from participants according to the different levels of a variable (`A`), which were the same across datasets. The sets of participants were, however, different across datasets.

Ideally, I would like the model to help me answer the question “How do the three environments differ from each other (overall and in the effect of `A`)?”. Furthermore, it would be great to answer the question “How much do individuals differ in their behavior within each dataset?”. The dependent variables I am interested in are multivariate in nature, so I would like to combine them in a single multivariate model.

As a first approach, I fitted three models with the same formula, one for each dataset. This allowed me to investigate the correlation between the two responses, `R1` and `R2`, within each dataset separately. However, it did not allow me to quantify the differences between environments.

Therefore, my approach has evolved to the following:

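``````
library(brms)

model <- brm(
  data = data_combined,
  # bf() wraps the multivariate formula so that set_rescor() can be added to it
  formula = bf(mvbind(R1, R2) ~ A * dataset + (1 | dataset:ID)) +
    set_rescor(TRUE)  # estimate the residual correlation between R1 and R2
)
``````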

Here, the interaction `A * dataset` should tell me how the environment changes the effect of `A` (the `A:dataset` terms) and the overall level (the `dataset` main effects, i.e., shifts in the intercept). I used the grouping term `dataset:ID` for the random intercept because participant IDs are reused across datasets but refer to different people. I may eventually try to expand the random effect to include the interaction, i.e., `(1 + A * dataset|dataset:ID)`, but I don’t have many observations per participant (only one per level of `A`), so I expect the model may not converge.

`data_combined` looks like this:

``````
dataset | ID | A |  R1 |  R2
-----------------------------
E1      |  1 | 1 | 1.2 | 2.6
E1      |  1 | 2 | 0.9 | 2.1
E1      |  1 | 3 | 0.8 | 2.2
E1      |  2 | 1 | 3.1 | 4.1
E1      |  2 | 2 | 4.2 | 3.4
...
E2      |  1 | 1 | 2.1 | 5.4
...
E3      |  1 | 1 | 2.3 | 4.4
...
``````
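For illustration, a data frame of this shape could be assembled from three per-environment data frames roughly as follows. This is only a sketch: the names `e1`, `e2`, `e3`, and `participant` are hypothetical, and the values are just the rows shown above. The `participant` column makes the "different people, same ID numbers" structure explicit; as far as I understand, using `(1 | participant)` should define the same grouping as `(1 | dataset:ID)`.

```r
# Hypothetical per-environment data frames containing only the rows shown above.
e1 <- data.frame(ID = c(1, 1, 1, 2, 2), A = c(1, 2, 3, 1, 2),
                 R1 = c(1.2, 0.9, 0.8, 3.1, 4.2),
                 R2 = c(2.6, 2.1, 2.2, 4.1, 3.4))
e2 <- data.frame(ID = 1, A = 1, R1 = 2.1, R2 = 5.4)
e3 <- data.frame(ID = 1, A = 1, R1 = 2.3, R2 = 4.4)

# Stack the data frames and record which environment each row came from.
envs <- list(E1 = e1, E2 = e2, E3 = e3)
data_combined <- do.call(
  rbind,
  Map(function(d, name) cbind(dataset = name, d), envs, names(envs))
)
rownames(data_combined) <- NULL
data_combined$dataset <- factor(data_combined$dataset)

# ID 1 in E1 and ID 1 in E2 are different people, so build a label that is
# unique across datasets (hypothetical column name).
data_combined$participant <- interaction(
  data_combined$dataset, data_combined$ID, drop = TRUE
)

print(data_combined)
```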

What this approach cannot give me is the difference in residual correlations (between `R1` and `R2`) between the environments, since the model estimates a single residual correlation for the combined data.

Does this approach sound reasonable for the problem at hand, or is there a better way? How could I go about comparing the variability between individuals across datasets? I think an intra-class correlation coefficient (ICC) might do that, but I would somehow need to calculate it per dataset from the combined model.
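To make the ICC idea concrete, here is a minimal sketch of the calculation I have in mind, applied draw by draw to posterior samples: the between-participant variance divided by the total (between-participant plus residual) variance, separately for each response. The helper names `icc_draws` and `summarise_icc` are hypothetical, and the draws below are simulated stand-ins, not model output. Note the assumption: the combined model above has only one participant SD and one residual SD per response, so per-dataset draws would require a model in which these SDs vary by dataset (possibly via brms's `gr(..., by = dataset)` for the group-level SD, which I have not tested here).

```r
# ICC per posterior draw: between-participant variance over total variance.
icc_draws <- function(sd_between, sd_resid) {
  stopifnot(length(sd_between) == length(sd_resid),
            all(sd_between >= 0), all(sd_resid > 0))
  sd_between^2 / (sd_between^2 + sd_resid^2)
}

# Posterior mean and 95% interval of a vector of draws.
summarise_icc <- function(x) {
  c(mean = mean(x), quantile(x, c(0.025, 0.975)))
}

# Simulated stand-in draws (NOT real results) for one response in two
# environments; in practice these would be posterior draws of the
# participant-level SD and the residual SD for each dataset.
set.seed(1)
n_draws <- 4000
sd_between_E1 <- abs(rnorm(n_draws, mean = 1.0, sd = 0.10))
sd_resid_E1   <- abs(rnorm(n_draws, mean = 0.5, sd = 0.05))
sd_between_E2 <- abs(rnorm(n_draws, mean = 0.6, sd = 0.10))
sd_resid_E2   <- abs(rnorm(n_draws, mean = 0.5, sd = 0.05))

icc_E1 <- icc_draws(sd_between_E1, sd_resid_E1)
icc_E2 <- icc_draws(sd_between_E2, sd_resid_E2)

# Differencing per draw keeps the posterior uncertainty in the comparison.
print(summarise_icc(icc_E1))
print(summarise_icc(icc_E2))
print(summarise_icc(icc_E1 - icc_E2))
```

Working with whole vectors of draws rather than point estimates means the difference between environments comes with its own credible interval.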

Any thoughts are highly welcome. :-)

• Operating System: Windows 10
• brms Version: 2.18.0