I have another question about model interpretation. I have my final models built and thought I understood how to interpret them, but after playing around a little I am confused again.
This is my model, which uses a beta likelihood with a logit link:
EXAM25 ~ 1 + Algorithm + LOC + (1|Project) + (1|Language)
Algorithm has 2 levels and LOC is continuous.
mcmc_areas gives this, which shows a clear difference between the two algorithms on the logit scale:
marginal_effects shows this picture, in which the difference looks a lot less certain, although Linespots still seems to have lower EXAM25 than Bugspots:
Now there are two ways to calculate the contrasts that I have seen. One is based on posterior_samples and one on posterior_predict. The posterior_samples based one looks like this (for the mean LOC):
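# draws of the model parameters (the b_* columns used below)
post = posterior_samples(model)
# difference in expected EXAM25 on the outcome scale: Linespots minus Bugspots
contrast = inv_logit_scaled(post$b_Intercept) - inv_logit_scaled(post$b_Intercept + post$b_AlgorithmBugspots)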
and looks like this:
Again, a clear difference between both algorithms on the outcome scale.
The posterior_predict one looks like this (I have a full factorial design so both subsets look the same besides results and Algorithm):
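# d is the data frame referred to as d$Algorithm in my original code
l = posterior_predict(model, newdata = subset(d, Algorithm == "Linespots"))
b = posterior_predict(model, newdata = subset(d, Algorithm == "Bugspots"))
# difference of predicted observations: Linespots minus Bugspots
contrast = l - b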
This contrast however looks very different from the ones before:
Now I am wondering what is going on here. Is this because the posterior_samples contrast only looks at the mean LOC and the mean project and language (as in 0), while the posterior_predict contrast aggregates across all LOC, project and language values? Or am I doing something else wrong?
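To make the comparison concrete, here is a minimal base R sketch of the two kinds of contrast. It uses simulated stand-in draws, not my fitted model: every number below is made up, and the names phi, mu_linespots and mu_bugspots are only for illustration. It assumes the beta likelihood is parameterised by a mean and a precision phi, so that the shape parameters are mu * phi and (1 - mu) * phi. One contrast is taken between expected values and the other between simulated observations, which additionally carry the beta observation noise.

```r
# Simulated stand-in for posterior draws; NOT the fitted model from this post.
set.seed(1)
n_draws <- 4000
b_Intercept <- rnorm(n_draws, mean = -1.5, sd = 0.10)         # hypothetical values
b_AlgorithmBugspots <- rnorm(n_draws, mean = 0.4, sd = 0.08)  # hypothetical values
phi <- rgamma(n_draws, shape = 50, rate = 5)                  # hypothetical precision draws

# Expected EXAM25 per draw (inverse logit of the linear predictor)
mu_linespots <- plogis(b_Intercept)
mu_bugspots <- plogis(b_Intercept + b_AlgorithmBugspots)

# Contrast of expected values: reflects parameter uncertainty only
contrast_mean <- mu_linespots - mu_bugspots

# Contrast of simulated observations: adds beta observation noise on top
y_linespots <- rbeta(n_draws, mu_linespots * phi, (1 - mu_linespots) * phi)
y_bugspots <- rbeta(n_draws, mu_bugspots * phi, (1 - mu_bugspots) * phi)
contrast_pred <- y_linespots - y_bugspots

# Compare the spread and the share of draws with lower EXAM25 for Linespots
c(sd_mean = sd(contrast_mean), sd_pred = sd(contrast_pred))
c(p_mean = mean(contrast_mean < 0), p_pred = mean(contrast_pred < 0))
```

In this toy setup the contrast of simulated observations is much more spread out than the contrast of expected values, even though LOC, project and language play no role in it. Whether the same thing explains the plots from my actual model is something I have not verified.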
I assume that there is no “right” way to do this and that, as always, it depends on what exactly I want to show. However, I am not sure what I should conclude from this.
Would it be fair to say that Linespots has lower EXAM25 at the mean LOC (and I guess I could just test for some range of LOC), with differences between projects and languages skewing the results?
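Checking a range of LOC values could look roughly like the sketch below. It again uses simulated stand-in draws rather than my model, and it assumes that LOC is standardised (so 0 is the mean LOC) and that there is a b_LOC coefficient column next to the two used above. The draws data frame, loc_grid and all numeric values are hypothetical.

```r
# Simulated stand-in for the parameter draws; NOT the fitted model from this post.
set.seed(2)
n_draws <- 4000
draws <- data.frame(
  b_Intercept = rnorm(n_draws, -1.5, 0.10),         # hypothetical values
  b_AlgorithmBugspots = rnorm(n_draws, 0.4, 0.08),  # hypothetical values
  b_LOC = rnorm(n_draws, 0.3, 0.05)                 # hypothetical values
)

# Standardised LOC values to evaluate (0 = mean LOC under the assumption above)
loc_grid <- c(-2, -1, 0, 1, 2)

# One column of contrast draws (Linespots minus Bugspots) per LOC value
contrast_by_loc <- sapply(loc_grid, function(loc) {
  eta_linespots <- draws$b_Intercept + draws$b_LOC * loc
  eta_bugspots <- eta_linespots + draws$b_AlgorithmBugspots
  plogis(eta_linespots) - plogis(eta_bugspots)
})
colnames(contrast_by_loc) <- paste0("LOC=", loc_grid)

# Median and 95% interval of the contrast at each LOC value
t(apply(contrast_by_loc, 2, quantile, probs = c(0.025, 0.5, 0.975)))
```

Because of the logit link, the same difference on the logit scale translates into a different difference on the outcome scale at each LOC value, so the contrast at the mean LOC need not be representative of the others.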