Hi,
We have fitted a multilevel binomial regression model in which one of the predictors is an ordinal variable, using the following formula:
ResponseN ~ 0 + Bias:DistanceN:Language + Bias:DistanceN:mo(Step) + Bias:DistanceN:Language:mo(Step) + (0 + Bias:DistanceN + Bias:DistanceN:mo(Step) | SubjectID)
The response is coded as 0 or 1.
The ordinal variable Step takes five values in our data (1, 4, 5, 6, 10). All the categorical predictors have two levels (Bias: taendt and sendt; DistanceN: NEAR and FAR; Language: Norwegian and Danish). We now want to test hypotheses at the midpoint (i.e. Step = 5). To do this, we are trying to construct the posterior estimates at the midpoint with the following code:
library(brms)   # posterior_samples()
library(dplyr)  # %>% and mutate()

post <- posterior_samples(model)
# Column names containing ":" or "[" must be wrapped in backticks inside mutate()
post <- post %>% mutate(
  tNEARdkmid = `b_Biastaendt:DistanceNNEAR:LanguageDanish`
    + `simo_Biastaendt:DistanceNNEAR:moStep1[1]`
    + `simo_Biastaendt:DistanceNNEAR:moStep1[2]`
    + `simo_Biastaendt:DistanceNNEAR:moStep1[3]`
    + `simo_Biastaendt:DistanceNNEAR:moStep1[4]`,
  sNEARdkmid = `b_Biassendt:DistanceNNEAR:LanguageDanish`
    + `simo_Biassendt:DistanceNNEAR:moStep1[1]`
    + `simo_Biassendt:DistanceNNEAR:moStep1[2]`
    + `simo_Biassendt:DistanceNNEAR:moStep1[3]`
    + `simo_Biassendt:DistanceNNEAR:moStep1[4]`,
  tFARdkmid = `b_Biastaendt:DistanceNFAR:LanguageDanish`
    + `simo_Biastaendt:DistanceNFAR:moStep1[1]`
    + `simo_Biastaendt:DistanceNFAR:moStep1[2]`
    + `simo_Biastaendt:DistanceNFAR:moStep1[3]`
    + `simo_Biastaendt:DistanceNFAR:moStep1[4]`,
  sFARdkmid = `b_Biassendt:DistanceNFAR:LanguageDanish`
    + `simo_Biassendt:DistanceNFAR:moStep1[1]`
    + `simo_Biassendt:DistanceNFAR:moStep1[2]`
    + `simo_Biassendt:DistanceNFAR:moStep1[3]`
    + `simo_Biassendt:DistanceNFAR:moStep1[4]`,
  tNEARnomid = `b_Biastaendt:DistanceNNEAR:LanguageNorwegian`
    + `simo_Biastaendt:DistanceNNEAR:moStep:LanguageNorwegian1[1]`
    + `simo_Biastaendt:DistanceNNEAR:moStep:LanguageNorwegian1[2]`
    + `simo_Biastaendt:DistanceNNEAR:moStep:LanguageNorwegian1[3]`
    + `simo_Biastaendt:DistanceNNEAR:moStep:LanguageNorwegian1[4]`,
  sNEARnomid = `b_Biassendt:DistanceNNEAR:LanguageNorwegian`
    + `simo_Biassendt:DistanceNNEAR:moStep:LanguageNorwegian1[1]`
    + `simo_Biassendt:DistanceNNEAR:moStep:LanguageNorwegian1[2]`
    + `simo_Biassendt:DistanceNNEAR:moStep:LanguageNorwegian1[3]`
    + `simo_Biassendt:DistanceNNEAR:moStep:LanguageNorwegian1[4]`,
  tFARnomid = `b_Biastaendt:DistanceNFAR:LanguageNorwegian`
    + `simo_Biastaendt:DistanceNFAR:moStep:LanguageNorwegian1[1]`
    + `simo_Biastaendt:DistanceNFAR:moStep:LanguageNorwegian1[2]`
    + `simo_Biastaendt:DistanceNFAR:moStep:LanguageNorwegian1[3]`
    + `simo_Biastaendt:DistanceNFAR:moStep:LanguageNorwegian1[4]`,
  sFARnomid = `b_Biassendt:DistanceNFAR:LanguageNorwegian`
    + `simo_Biassendt:DistanceNFAR:moStep:LanguageNorwegian1[1]`
    + `simo_Biassendt:DistanceNFAR:moStep:LanguageNorwegian1[2]`
    + `simo_Biassendt:DistanceNFAR:moStep:LanguageNorwegian1[3]`
    + `simo_Biassendt:DistanceNFAR:moStep:LanguageNorwegian1[4]`
)
What puzzles us is that the midpoint estimates from this code come out negative. Yet when we plot the raw data or the model's predicted values, the proportion of “1” responses at the midpoint is above 50%, so the estimate should be positive on the log-odds scale.
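To make the expectation concrete, here is a minimal base-R check of the relationship between proportions and log-odds. The values in `p` are made up for illustration and are not taken from our data.

```r
# Illustrative proportions only (not from the fitted model or the raw data)
p <- c(0.25, 0.50, 0.75)

# Logit transform: log(p / (1 - p)); below 0.5 is negative, above 0.5 is positive
log_odds <- qlogis(p)
print(log_odds)

# The inverse logit maps the log-odds back to the original proportions
print(plogis(log_odds))
```

So any cell whose predicted proportion is above 0.5 should have a positive linear predictor.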
So our question is: are we interpreting the model incorrectly? What are we doing wrong that produces negative estimates at Step = 5?
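For comparison, below is a minimal sketch of how we understand a single mo() term to enter the linear predictor, assuming the parameterization described in the brms monotonic-effects vignette: a slope multiplied by the number of steps D and by the cumulative sum of the simplex up to the level of interest. Everything in it is hypothetical: the draws are simulated, and the names `b_cell`, `bsp`, `simo` and `mo_term` are stand-ins rather than columns of our posterior. It also covers only one mo() term; in our model the Language interaction would add a second term with its own slope and simplex.

```r
# Hypothetical illustration: simulated draws, NOT output from the fitted model
set.seed(1)
n_draws <- 1000
D <- 4  # five ordered Step levels (1, 4, 5, 6, 10) give 4 steps between levels

# Assumed stand-ins for one Bias:DistanceN:Language cell
b_cell <- rnorm(n_draws, mean = -1, sd = 0.2)   # cell coefficient (lowest Step level)
bsp    <- rnorm(n_draws, mean = 0.8, sd = 0.1)  # monotonic slope (average change per level)

# Simplex draws: every row is non-negative and sums to 1
raw  <- matrix(rexp(n_draws * D), nrow = n_draws)
simo <- raw / rowSums(raw)

# Contribution of the monotonic term at 0-based level x:
# bsp * D * (sum of the first x simplex elements)
mo_term <- function(x) {
  if (x == 0) return(rep(0, n_draws))
  bsp * D * rowSums(simo[, seq_len(x), drop = FALSE])
}

eta_mid <- b_cell + mo_term(2)  # Step = 5 is the third level, so x = 2
eta_max <- b_cell + mo_term(D)  # highest level: the full simplex sums to 1

print(summary(eta_mid))
```

Under this reading, summing all four simplex elements always gives 1 and corresponds to the highest Step level rather than the midpoint, and the slope has to be part of the sum. We are not sure whether this matches what our code above does.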
Thanks in advance!