Testing consistency of observers' ordinal ratings across multiple experiments

I’m at work and unfortunately the site hosting your experiment and data is blocked, but I have a few ideas.
If the second experiment is essentially a replication of the first, with the same observers, you might consider adding experiment to the hierarchical model as another level. For example, you could use something like (filter | experiment/id), so that ids are nested within experiments. You will need a reasonable prior on the sd for experiment, especially with only two experiments. This would work particularly well if you later ran several more experiments.
You would obtain varying intercepts and slopes for each experiment and each id within experiment.
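A minimal sketch of that formula in brms, assuming an ordinal outcome called `rating`, a data frame `d`, and the variable names `filter`, `experiment`, and `id` from above (all placeholders for your actual columns):

```r
library(brms)

# Varying intercepts and slopes for filter, for each experiment
# and for each id nested within experiment; cumulative family for
# ordinal ratings. The normal(0, 1) prior on the experiment-level
# sd is just an example of a regularizing choice -- tune it to the
# scale of your latent variable.
fit <- brm(
  rating ~ filter + (filter | experiment/id),
  data   = d,
  family = cumulative("probit"),
  prior  = prior(normal(0, 1), class = sd, group = experiment),
  chains = 4, cores = 4
)
```

With only two experiments the experiment-level sd is barely identified from the data, which is why the prior on it matters so much here.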
You might take a look at this thread. It’s a little lengthy and not exactly the same topic, but there I ran a model for data with matched pairs nested within experiments nested within object types. In that thread Aki links to examples of using cross-validation to evaluate hierarchical models and has code for integrated LOO, for example here, if you want to do leave-group-out CV.
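If you only need leave-one-group-out CV (rather than the integrated LOO from Aki's code), brms has built-in support via kfold(). A sketch, assuming the `fit`, `d`, and `id` names used earlier:

```r
# Leave-one-observer-out CV: K equal to the number of ids, with
# the group argument keeping each observer's rows in a single fold,
# so every fold drops one observer entirely and refits the model.
logo <- kfold(fit, K = length(unique(d$id)), group = "id")
logo
```

Note this refits the model K times, so it can be slow; the integrated-LOO approach in the linked thread avoids the refits at the cost of more hand-written code.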
