The following is a somewhat simplified version of a problem I posted earlier; perhaps some input here can also help me get going on the other problem.

I have data from three similar datasets, collected in different environments {E1, E2, E3} (stored in the variable dataset), in which participants (a different set and number of participants per dataset), identified by ID, experienced the same levels of a binary factor T {0, 1} and had to decide between two options {A, B} (stored in decision).

What I want to model/investigate is the overall effect of T on participants' behavior, and how much that effect differs between datasets.

My current approach is using nested random effects:

model <- brm(
  formula = decision ~ dataset * T + (1 | dataset:ID),
  family = bernoulli(),
  data = data
)

Does this sound like a sensible approach to the problem/research question at hand? I think my main challenge is to understand how best to accommodate ID and dataset in the random effects, and whether the interaction term makes sense. An alternative I see is to fit three separate models with the same formula, one per dataset, and then compare their parameter distributions; however, this may not allow me to quantify differences between models?

Another alternative would be to include dataset as a random effect, by setting up a nested random effect as below. However, dataset has only three levels, so I am not sure this is a valid alternative.

model <- brm(
  formula = decision ~ T + (1 + T | dataset/ID),
  family = bernoulli(),
  data = data
)

Sorry your original question slipped through the cracks. A couple of questions:

Do any IDs occur in multiple datasets?

Do you wish to assume that the random effect variance due to ID is the same for all datasets?

Note that it's straightforward to quantify the differences between models in some quantity of interest (just difference that quantity iteration-wise across the posterior samples to get a posterior distribution of the difference). However, it can be less straightforward to express a prior on that difference when fitting three different models as compared to fitting them all at once in a way that is parameterized by the difference itself.
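For example, assuming two separately fitted brmsfit objects fit1 and fit2 that share a coefficient for T, the iteration-wise difference could be computed along these lines (a sketch; fit1, fit2, and the parameter name b_T are illustrative and depend on how T is coded in your model):

```r
library(posterior)

# Extract posterior draws of the T coefficient from each fitted model
# (fit1 and fit2 are hypothetical brmsfit objects fit to two datasets)
d1 <- as_draws_df(fit1)$b_T
d2 <- as_draws_df(fit2)$b_T

# Difference the quantity iteration-wise to obtain a posterior
# distribution of the between-dataset difference
diff_T <- d1 - d2
quantile(diff_T, probs = c(0.025, 0.5, 0.975))
```

This relies on both models having the same number of post-warmup draws; subsetting or thinning may be needed otherwise.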

I use the same numbering for ID in each dataset, {1, 2, ..., N}, but all participants were different, i.e., no one took part in more than one of the datasets.

Not sure I understand you here, but I guess this may be how I model it in the first formula (original post)? There, all participants contribute to one random-effect variance. Since participants differed between datasets, I don't think that all datasets should have the same variance.

This sounds reasonable to me, and I am therefore leaning toward the one-model solution.

In the random-effects model, the intercepts for different participants are assumed to arise from some normal distribution whose variance is fitted from the data. Do you wish to fit a single variance across all three datasets (as in your proposed formulas), or do you wish to model the possibility that this variance itself differs by dataset (which is what you would get by fitting separate models, though this is also possible to express in brms syntax)?

Also, one last question: given that the response is Bernoulli, modeling random effects by participant is likely to work only if a nontrivial number of participants were recorded making multiple decisions in the data, and in particular if a nontrivial number of participants were recorded making both possible decisions at least once. If you only have one decision per participant, then you cannot expect to fruitfully model the random effect variance except by injecting strong prior information.
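This can be checked directly in the data; a sketch in base R, using the column names from the description above:

```r
# Number of recorded decisions per participant within each dataset
n_per_id <- aggregate(decision ~ dataset + ID, data = data, FUN = length)
table(n_per_id$decision)

# Number of distinct options each participant chose: a value of 2 means
# the participant made both possible decisions at least once
k_per_id <- aggregate(decision ~ dataset + ID, data = data,
                      FUN = function(x) length(unique(x)))
table(k_per_id$decision)
```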

To me, it sounds acceptable to assume that the intercepts of all participants come from some common normal distribution with a specific variance. However, it seems more useful to model this variance grouped by dataset. In that sense, another option I came across may be the following:

formula = decision ~ dataset * T + (1 | gr(dataset:ID, by = dataset)),

Though, if I understood correctly, this is not a different model; rather, the variances are calculated by grouping dataset:ID into the three different datasets and then estimating a separate variance for each of them.

This may be a critical one, depending on how "nontrivial" could/should be defined (I'm maybe not experienced enough to know). In each dataset, each participant contributed only two data points, one per level of T, i.e., two decisions per participant. Some participants chose the same option in both runs, some chose both options. I also found that I had to set stronger priors; with the default ones the model did not converge. normal(0, 5) still worked, but then the fitted coefficients became much larger (which I guess is expected behavior). Does this sound like I should not use random effects at all?
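For reference, priors like these can be specified in brms roughly as follows (a sketch; the exact classes and values are assumptions and would need to be adapted to the actual model):

```r
library(brms)

# Hypothetical model combining the gr(by = ...) grouping term with
# explicit priors; normal(0, 5) on the coefficients as mentioned above,
# plus a tighter prior on the group-level SDs to aid convergence
model <- brm(
  decision ~ dataset * T + (1 | gr(dataset:ID, by = dataset)),
  family = bernoulli(),
  data = data,
  prior = c(
    set_prior("normal(0, 5)", class = "b"),  # population-level coefficients
    set_prior("normal(0, 1)", class = "sd")  # group-level SDs (half-normal due to lower bound)
  )
)
```

With only two observations per participant, the sd prior will dominate the estimate of the between-participant variance, so its choice deserves a sensitivity check.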