IRT: Discrepancy in estimates hypothesis & conditional effects

Hi all,

We have been trying out IRT modelling as demonstrated by @paul.buerkner, but are having trouble interpreting the estimates from the fitted model. In particular, we are finding a discrepancy between the estimates returned by hypothesis() and those shown in the conditional effects plots, which we don’t understand.

We are not sure whether these discrepancies come from a bug or whether there is a conceptual misunderstanding on our part.

Hopefully someone can help with this.

This is the model:

formula <- bf(
  Acc ~ 0.5 + 0.5 * inv_logit(eta),
  eta ~ 1 + pairType * (MDT + MPT) + (1 | item) + (1 | participant),
  nl = TRUE
)

family <- brmsfamily("bernoulli", link = "identity")

priors <-
  prior("normal(0, 5)", class = "b", nlpar = "eta") +
  prior("constant(1)", class = "sd", group = "participant", nlpar = "eta") +
  prior("normal(0, 3)", class = "sd", group = "item", nlpar = "eta")

fit <- brm(formula = formula,
           data = data,
           family = family,
           prior = priors,
           iter = 10000,
           warmup = 1000,
           cores = 4,
           control = list(adapt_delta = 0.99, max_treedepth = 12))

Note that we removed the discrimination parameter, as all items are very similar and adding it did not improve the model. pairType is a factor with three levels; MPT and MDT are continuous predictors, both scaled (possible values ranging from -4 to 4) and centred at zero. The response variable is Acc (accuracy, coded 0 or 1).
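For reference, this is roughly how a discrimination parameter would re-enter the non-linear formula if we brought it back (a sketch following the usual brms IRT setup; logalpha is a hypothetical name for the log-discrimination here):

```r
formula_2pl <- bf(
  Acc ~ 0.5 + 0.5 * inv_logit(exp(logalpha) * eta),  # exp() keeps discrimination positive
  eta ~ 1 + pairType * (MDT + MPT) + (1 | item) + (1 | participant),
  logalpha ~ 1 + (1 | item),  # item-varying discrimination
  nl = TRUE
)
```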

The problem occurs when we want to calculate evidence ratios or posterior probabilities and compare these with visualisation of the effects from conditional effects.

For example, if we want to check the posterior probability of the Intercept (which corresponds to the first level of pairType with MPT and MDT centred at zero), we would run this hypothesis:

hypothesis(fit, "eta_Intercept = 0")

The estimate we get is -2.45, which we then run through the inv_logit transform in order to be able to compare it with the estimates in the conditional effects plots.

So that will be:
(inv_logit(-2.45) * 0.5) + 0.5 = 0.5397193
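In R, the same back-transformation can be checked with plogis(), the built-in inverse logit:

```r
0.5 + 0.5 * plogis(-2.45)  # ~0.5397, as above
```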

If we then check the estimate used to create the Intercept in the conditional effects plots, this value is 0.544.

If we repeat the same procedure with the other pair types, the values are also off, although each time by different amounts. To our understanding, these estimates should be the same.

Does anyone know where these discrepancies come from?


Just for clarification: what are you using from the hypothesis() output? If you just want the estimate of the intercept for the eta parameter, I would think you’d use summary() or coef(). Once you start adding in the different factor levels and such, you could transition to computing those values directly from the posterior samples (e.g., samples <- posterior_samples(fit)).
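For example, one could pull the intercept draws and summarise them by hand (a sketch, assuming brms’s default naming b_eta_Intercept for the intercept of the nlpar eta):

```r
post <- posterior_samples(fit, pars = "b_eta_Intercept")

mean(post$b_eta_Intercept)    # posterior mean of the intercept on the eta scale
median(post$b_eta_Intercept)  # posterior median, which robust summaries report
```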

In brms, hypothesis() is, in part, a convenience function for combining posterior distributions. The problem identified in the original post is that we would expect the outputs of hypothesis and conditional_effects to be identical here, but they are not. I wonder if this is related to Ordinal Regression - conditional effects discrepancy?

Looking at the source code for hypothesis(), it seems to compute the fixed effects from the posterior samples. If I had to guess, conditional_effects() is probably including random-effects estimates as well. I’d have to double-check, but I think conditional_effects() defaults to the posterior predictive distribution, whereas hypothesis() uses just the posterior samples. Checking the arguments of conditional_effects(), there are a few things worth trying. For starters, specifying robust = FALSE might correct the issue (i.e., returning the mean rather than the median). Additionally, method = "posterior_linpred" should return predictions based on just the linear component; the brms default is posterior_epred (expected values of the posterior predictive distribution).
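Concretely, the two checks suggested above would look something like this (a sketch; effects = "pairType" is my assumption about the term of interest):

```r
# mean instead of median summaries
conditional_effects(fit, effects = "pairType", robust = FALSE)

# predictions from the linear predictor only, skipping the epred transformation
conditional_effects(fit, effects = "pairType", method = "posterior_linpred")
```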

Thank you – some good insights! Looking at the documentation, random effects are not included in conditional_effects by default, but it seems the default is robust = TRUE (whoda thunk!). So that seems a highly probable cause of the discrepancy (given the skewed posterior distributions in a model such as this).

Thanks for your suggestions.

I tried robust = FALSE, which slightly changed the estimates, but not enough to explain the discrepancy, and in the wrong direction as well. Specifying method = "posterior_linpred" did not change anything. I also tried setting re_formula = NULL as suggested in Ordinal Regression - conditional effects discrepancy, but that did not help either, unfortunately.

Not sure what is going on here!

This is an overview of things we’ve tried:

  1. Tried robust = FALSE in conditional_effects (did not help, made discrepancy slightly worse)
  2. Tried to include random effects by setting re_formula = NULL in conditional effects (did not help, made the discrepancy worse)
  3. Made sure that continuous predictors are centred and set conditions to 0 for both continuous predictors in conditional effects
  4. Thought it may be an issue related to the numbers being rounded up or down in the hypothesis value, so re-calculated estimates with sufficient decimal places (was not the problem)

@paul.buerkner do you have any idea of what might be happening here?

You cannot non-linearly transform posterior means after summarising and expect to get the correct results (Jensen’s inequality). Always transform at the level of the posterior samples (or within hypothesis()) and then summarise at the end. Try out

hypothesis(fit, "inv_logit(eta_Intercept) = 0")

or whatever quantity you are interested in.
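A quick way to see Jensen’s inequality at work is to compare the two orders of operation on simulated draws (the numbers here are illustrative stand-ins, not the actual posterior):

```r
set.seed(123)
eta <- rnorm(20000, mean = -2.45, sd = 1)  # stand-in for posterior draws of eta_Intercept

0.5 + 0.5 * plogis(mean(eta))  # transform the summary: what the original calculation did
mean(0.5 + 0.5 * plogis(eta))  # summarise the transformed draws: what brms reports
```

The second value comes out larger here because inv_logit is convex for negative eta, which matches the direction of the discrepancy observed above.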


Thanks, that works perfectly!