Multivariate model to understand effect of different datasets using the same model formula

tester · November 9, 2023, 5:22pm

I am working on a model that should help me understand the differences between three similar datasets collected in different environments (E1, E2, E3). Each dataset contains repeated measurements from participants, one for each level of a variable (A). The levels of A were the same across datasets, but the participants were different in each dataset.

Ideally, I would like the model to help me answer the question “How do the three environments differ from each other (overall and for A)?”. Furthermore, it would be great to answer the question “How much do individuals differ in their behavior within each dataset?”. The dependent variables I am interested in (R1 and R2) are multivariate in nature, so I would like to combine them in a single multivariate model.

As a first approach, I fitted three models with the same formula, one for each dataset. This allowed me to investigate the correlations between R1 and R2 for each dataset separately. However, it did not allow me to quantify the differences between environments.

Therefore, my approach has evolved to the following:

library(brms)

model <- brm(
    data = data_combined,
    formula =
        # the formula is wrapped in bf() so that set_rescor() can be added to it
        bf(mvbind(R1, R2) ~ A * dataset + (1 | dataset:ID)) +
        set_rescor(TRUE)
)

where the interaction A * dataset should give me information about the effect of the environment on A and overall (on the intercept). I used the random effect dataset:ID since the participants are different between datasets. I may eventually try to expand the random effect to include the interaction, i.e., (1 + A * dataset|dataset:ID), but I don’t have many observations per participant (only one per level of A), so I expect the model may not converge.
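To see which coefficients A * dataset produces, the design matrix can be inspected in base R. This is only a sketch: it assumes that A is coded as a factor with three levels and that R's default treatment contrasts are used. The grid below is a stand-in, not the real data_combined.

```r
# Stand-in grid with every combination of A (assumed to be a 3-level factor)
# and dataset; this is not the real data_combined.
grid <- expand.grid(
    A = factor(1:3),
    dataset = factor(c("E1", "E2", "E3"))
)

# A * dataset expands to A + dataset + A:dataset
X <- model.matrix(~ A * dataset, data = grid)
print(colnames(X))
```

Under these assumptions, the dataset columns are differences between environments at the reference level of A, and the A:dataset columns describe how the effect of A changes between environments.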

data_combined looks like this:

dataset | ID | A |  R1 |  R2
-----------------------------
E1      |  1 | 1 | 1.2 | 2.6
E1      |  1 | 2 | 0.9 | 2.1
E1      |  1 | 3 | 0.8 | 2.2
E1      |  2 | 1 | 3.1 | 4.1
E1      |  2 | 2 | 4.2 | 3.4
.
.
.
E2      |  1 | 1 | 2.1 | 5.4
.
.
.
E3      |  1 | 1 | 2.3 | 4.4
.
.
.
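For reference, a table like this could be assembled from three per-environment data frames roughly as follows. This is a minimal sketch in base R: the names e1, e2, e3 and participant are hypothetical, and only the rows shown above are included.

```r
# Hypothetical per-environment data frames holding only the rows shown above
e1 <- data.frame(ID = c(1, 1, 1, 2, 2), A = c(1, 2, 3, 1, 2),
                 R1 = c(1.2, 0.9, 0.8, 3.1, 4.2),
                 R2 = c(2.6, 2.1, 2.2, 4.1, 3.4))
e2 <- data.frame(ID = 1, A = 1, R1 = 2.1, R2 = 5.4)
e3 <- data.frame(ID = 1, A = 1, R1 = 2.3, R2 = 4.4)

# Label each data frame with its environment before stacking
e1$dataset <- "E1"
e2$dataset <- "E2"
e3$dataset <- "E3"

data_combined <- rbind(e1, e2, e3)
data_combined <- data_combined[, c("dataset", "ID", "A", "R1", "R2")]
data_combined$dataset <- factor(data_combined$dataset)

# ID restarts at 1 in every dataset, so a participant is only identified by
# the combination of dataset and ID (what dataset:ID expresses in the formula)
data_combined$participant <- interaction(data_combined$dataset,
                                         data_combined$ID, drop = TRUE)
print(data_combined)
```

The explicit participant column is not needed by the model above; it only makes visible that ID 1 in E1 and ID 1 in E2 are different people.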

What this approach cannot give me is the difference in residual correlations (between R1 and R2) between the environments, since a single residual correlation is estimated for the combined data.

Does this approach sound reasonable for the problem at hand, or could there be a better way? How could I go about comparing the variability between individuals across datasets? I think an intra-class correlation coefficient (ICC) may be able to do that, but I would somehow need to calculate it per dataset from the combined model.
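To make the ICC idea concrete, here is a rough sketch of the calculation I have in mind, applied per posterior draw. It assumes that a between-participant SD is available separately for each dataset, which the model above does not provide since it estimates one SD for dataset:ID (brms's gr(..., by = dataset) might be one way to get this, but I have not tried it). The helper icc_draws and the variable names are hypothetical, and the draws are simulated placeholders, not model output.

```r
# ICC per posterior draw: between-participant variance over total variance
icc_draws <- function(sd_between, sd_residual) {
    sd_between^2 / (sd_between^2 + sd_residual^2)
}

# Placeholder draws, NOT model output: simulated only so that the sketch runs.
# In practice these would be posterior draws of the participant SD for one
# dataset and of the residual SD for one response (e.g. R1).
set.seed(1)
sd_id_E1 <- abs(rnorm(1000, mean = 1.0, sd = 0.1))
sigma_R1 <- abs(rnorm(1000, mean = 0.5, sd = 0.05))

icc_E1 <- icc_draws(sd_id_E1, sigma_R1)

# Posterior median and 95% interval of the ICC
print(quantile(icc_E1, probs = c(0.025, 0.5, 0.975)))
```

Repeating this for E2 and E3 and taking differences of the draws would give a posterior for the difference in ICC between datasets, provided the draws come from the same fitted model.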

Any thoughts are highly welcome. :-)

Operating System: Windows 10
brms Version: 2.18.0
