Trial types as random effects?

Hello everyone,

I am new both to Bayesian statistics in general (using brms) and to this community, so I am sorry if (a) I did not use the right tags and (b) the answer is too obvious.

I have data from 3 groups (n = 30 each). Each individual completed 24 trials: 4 scenarios with 6 trials each, where each scenario's 6 trials comprised 3 types with 2 trials each, and the 2 trials of a type differed in the sequence in which the stimuli were presented.

I am primarily interested in the group * scenario interaction.

So I could fit:

dependent_variable ~ 1 + group*scenario + (1 | ID)

However, I thought that there are not only trials nested in participants: sequences are nested in types, which are nested in scenarios, which are nested in participants (see above). So I tried:

dependent_variable ~ 1 + group*scenario + (1 | sequence/type/ID)

Note that I did not model scenario as a random effect, since it is already included in the fixed effects. In the second model the results are far more "significant".
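
For completeness, roughly how I fit the two models with brm() (dat stands for my data frame, and the Gaussian family is just a placeholder here):

library(brms)

# model with varying intercepts by participant only
fit_id_only <- brm(
  dependent_variable ~ 1 + group * scenario + (1 | ID),
  data = dat, family = gaussian()
)

# model with the additional nesting structure
fit_nested <- brm(
  dependent_variable ~ 1 + group * scenario + (1 | sequence/type/ID),
  data = dat, family = gaussian()
)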

My questions are:

  • Which model should I prefer? On the one hand, I want to account for the data structure as well as I can. On the other hand, I am not sure whether it makes sense the way I did it, especially because the results are so much better.

I am very grateful for your answers!

Yours

Simon


Hi Simon,

Don’t worry about the answer seeming obvious – your question actually digs into a lot of the nuance of modeling heterogeneity across differing contexts! I will try to write a bit more later but I would suggest checking out Ch 13 of “Data Analysis Using Regression and Multilevel/Hierarchical Models” by Andrew Gelman and Jennifer Hill and/or Michael Betancourt’s factor modeling case study (Impact Factor) in the interim.

I would also caution against preferring a model because it produces “significant” effects of interest. Building a model that fits the data well and then estimating effect sizes and their uncertainty is usually better.


Good Morning js592,

Thank you for your answer. I will follow your book recommendation; maybe things will become clearer then…

Yours,

Simon

Alright, after some more research I realized that if sequences are nested in types, which are nested in participants, and I am interested in the group * scenario interaction, I should write:

dependent_variable ~ 1 + group * scenario + (1 | ID/type/sequence)

instead of

dependent_variable ~ 1 + group * scenario + (1 | sequence/type/ID)

So the formula goes from the highest to the lowest grouping level, i.e. … (1 | Level 4/Level 3/Level 2), right?
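
As far as I understand, the slash notation is just shorthand that expands into one varying intercept per nesting level, so (assuming brms follows the usual lme4 convention here) the model above should be equivalent to:

dependent_variable ~ 1 + group * scenario + (1 | ID) + (1 | ID:type) + (1 | ID:type:sequence)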

Then, my results are very similar to

dependent_variable ~ 1 + group * scenario + (1 | ID)

which makes sense. Now I have to think about the response distribution, because the dependent variable is bounded count data between 0 and 9, and both the Gaussian and the (truncated) Poisson distribution produce horrible posterior predictive checks. Well, that is another problem…

If you have some thoughts to share, I am very interested in hearing them. Either way, a big thanks already for the useful links…

Simon

Hi Simon,

If you have several models fitted to the same data, PSIS-LOO can provide an objective comparison between models. brms provides some handy helper functions: add_criterion and loo_compare. The comparison works across different parameterizations, varying effects structures, and distributions.
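
For example, something like this (fit1 and fit2 standing in for two brmsfit objects fitted to the same data):

library(brms)

# compute and store PSIS-LOO for each model
fit1 <- add_criterion(fit1, "loo")
fit2 <- add_criterion(fit2, "loo")

# rank the models by estimated expected predictive accuracy
loo_compare(fit1, fit2, criterion = "loo")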


I don’t know your dependent variable, so I don’t know if it makes sense, but have you thought about a binomial distribution (or possibly a beta-binomial to allow for overdispersion)? When comparing models with loo, keep in mind that it cannot easily compare models with discrete and continuous response distributions to one another (Cross-validation FAQ).
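
For example, if your 0 to 9 outcome really is a count out of 9 possible "successes", something along these lines could be a starting point (dat and the formula are placeholders; the beta_binomial() family is built into recent brms versions, while older versions need a custom family as shown in the brms vignette on custom response distributions):

library(brms)

# binomial: each observation is a count out of 9 trials
fit_bin <- brm(
  dependent_variable | trials(9) ~ 1 + group * scenario + (1 | ID/type/sequence),
  data = dat, family = binomial()
)

# beta-binomial to allow for overdispersion
fit_bb <- brm(
  dependent_variable | trials(9) ~ 1 + group * scenario + (1 | ID/type/sequence),
  data = dat, family = beta_binomial()
)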
