Hi,
sorry for not getting back to you earlier; your question is relevant and well written.
First, I think `plogis` should not be very useful here: for the purpose of the hypothesis, you should get the same probabilities with and without the `plogis` transform (as it maintains ordering). This implies that the first two hypotheses should be equivalent (you just add the same value to both sides). Since they did not give equivalent results, I assume that either:

a) there is some bug in the code,
b) `brms` does something weird with the `plogis` transform and the code does something other than what I would naively expect, or
c) there are numerical issues and the values of the `b_Intercept[]` terms make `plogis` overflow/underflow to 1/0, so for a lot of samples the predictions are equal (if that's the case, you should see a big difference when changing the `>` in the hypothesis to `>=`; you can check this directly from the posterior draws, as in the sketch below).
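A minimal sketch of that check, assuming a fitted model `fit` whose draws contain parameters named `b_Intercept[1]` and `b_Intercept[2]` (the names are assumptions; adapt them to your model):

```r
library(brms)

# Compute the hypothesis probabilities directly from the posterior draws
# so we can compare strict (>) and non-strict (>=) versions.
draws <- as_draws_df(fit)
i1 <- draws$`b_Intercept[1]`
i2 <- draws$`b_Intercept[2]`

# plogis is strictly increasing, so these should agree up to numerical
# precision:
mean(i1 > i2)
mean(plogis(i1) > plogis(i2))

# ...unless plogis saturates to exactly 0 or 1 for many draws, in which
# case the strict and non-strict comparisons diverge:
mean(plogis(i1) >= plogis(i2))
```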
In any case, I think you are misunderstanding some of the core aspects of the model. `brms` does a lot of stuff for you and it can unfortunately hide a lot of the important details. It IMHO makes little sense to use `plogis` to transform the coefficients or their sums: `plogis` can be used to transform the whole linear predictor into a probability, but when inspecting only a subset of the coefficients, it is IMHO more reasonable to think about those as changes in log odds for the adjacent choices (which are themselves weird; I've personally never quite understood when the adjacent-category model is preferable to cumulative/sequential models). Additionally, since the intercepts correspond to different categories, it IMHO does not make any sense to combine them in a single hypothesis. This holds even more true for the `disc` predictors: I cannot think of a scenario where a sum of coefficients from the main linear predictor with coefficients from the `disc` predictor is a useful quantity.
I invite you to (re-)check the tutorial at https://journals.sagepub.com/doi/epub/10.1177/2515245918823199 , especially the appendix, as that might clarify some of the issues. Reviewing some basics of linear models (e.g. using the resources at Understanding basics of Bayesian statistics and modelling) might also be sensible.
If you are not inclined towards getting deep into the model math, one needs to be extremely careful when drawing inferences from the coefficients alone in such a complex model. A relatively safe approach is to rethink the inference as a prediction task. E.g. for a model `response ~ Condition`, the posterior for the `b_Condition` coefficient corresponds to predictions of the difference in means between hypothetical future replications without any noise (or, alternatively, with noise but infinite sample size).
So instead of using `hypothesis` (which requires you to understand what the coefficients are actually doing in the model) you might want to create a dataset representing a hypothetical scenario you are interested in and use `posterior_predict` / `posterior_epred` / `posterior_linpred` to make appropriate predictions and compare the predictions for different scenarios; a sketch follows below. There are different ways to set up the dataset for such a hypothetical scenario that correspond to different scientific questions (e.g. do you predict for subjects with the same covariates that you observed or for some new "typical" subject? do you want to include observation noise?).
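A minimal sketch of this approach, assuming a fitted model `fit` with a single factor `Condition` and hypothetical levels "A" and "B" (note that for ordinal families `posterior_epred` returns a draws × observations × categories array, so the indexing below would need to be adapted):

```r
# Hypothetical scenario: two observations differing only in Condition
# (names and levels are placeholders; adapt to your data).
newdata <- data.frame(Condition = c("A", "B"))

# Draws of the expected response for each scenario; rows are posterior
# draws, columns correspond to rows of `newdata`.
epred <- posterior_epred(fit, newdata = newdata)

# Posterior of the difference between the two scenarios:
diff_draws <- epred[, 2] - epred[, 1]
quantile(diff_draws, probs = c(0.025, 0.5, 0.975))
mean(diff_draws > 0)  # posterior probability that "B" exceeds "A"
```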
Similar recommendations apply for Q2.
Does that make sense?
Your model assumes that the effect of `Ability` on the logit scale is the same for both groups (as you don't model the `Ability:Train` interaction). You could relax this assumption by adding the interaction to the model and inspecting whether the posterior for the coefficient concentrates around zero. You could also use model comparison techniques to compare the model with the interaction to the model without the interaction term (see e.g. Hypothesis testing, model selection, model comparison - some thoughts for some additional discussion of the options); a sketch of this comparison follows below.
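A minimal sketch of the comparison, where the formula, family and the data name `d` are hypothetical placeholders (reuse your actual formula, family and priors):

```r
library(brms)

# Fit the model with and without the interaction term.
fit_no_int <- brm(response ~ Ability + Train, data = d, family = acat())
fit_int    <- brm(response ~ Ability * Train, data = d, family = acat())

# Inspect the interaction coefficient's posterior directly...
summary(fit_int)

# ...and/or compare predictive performance via approximate
# leave-one-out cross-validation:
loo_compare(loo(fit_no_int), loo(fit_int))
```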
Best of luck with your model!