Credible intervals changing depending on the method used to find them

Sorry, your question fell through the cracks a bit. Thanks for being investigative about this; it is indeed a bit weird, and I can’t rule out that it is a symptom of a bug.

First, a bit of conceptual advice:

Generally we discourage people from making binary “effect exists” vs. “effect doesn’t exist” judgements: the very fact that you put the interaction in your model means that, mathematically, the effect can never be estimated as exactly zero. Depending on the posterior CI, what you can say is one of:

  • “we cannot learn much about the effect from the data” (when the posterior is very wide and includes both positive and negative values)
  • “there is some evidence for a positive/negative effect” (when the posterior CI excludes zero) - still, if the posterior includes values close to zero, you can’t rule out that the effect is actually negligible in practice
  • “there is evidence against a substantial effect” (when the posterior is tightly concentrated around zero and thus excludes meaningful effect sizes on both sides); what counts as a “meaningful” or “substantial” effect depends on the domain.
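The three cases above can be sketched as a toy classification in Python with simulated draws (not brms output); the `rope` threshold for a “negligible” effect is a made-up assumption that would come from your domain:

```python
import numpy as np

def interpret_ci(samples, rope=0.1, prob=0.95):
    """Classify posterior draws along the three cases above.

    `rope` is a hypothetical, domain-specific threshold below which an
    effect is treated as negligible in practice (an assumption here)."""
    alpha = 1 - prob
    lo, hi = np.quantile(samples, [alpha / 2, 1 - alpha / 2])
    if -rope < lo and hi < rope:
        # posterior tightly concentrated around zero
        return "evidence against a substantial effect"
    if lo > 0 or hi < 0:
        # CI excludes zero (effect may still be negligible if near zero)
        return "some evidence for a positive/negative effect"
    # wide posterior spanning both signs
    return "cannot learn much about the effect from the data"

rng = np.random.default_rng(1)
print(interpret_ci(rng.normal(0.0, 5.0, 10_000)))   # wide posterior
print(interpret_ci(rng.normal(2.0, 0.5, 10_000)))   # clearly positive
print(interpret_ci(rng.normal(0.0, 0.02, 10_000)))  # tight around zero
```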

Also remember that 95% intervals can be somewhat sensitive to the stochasticity of MCMC (unless you run a lot of iterations) and to minor changes in the input data. So for practical purposes, the difference between a CI of, say, [-5.1, 0.01] and [-5.04, -0.03] would usually be completely irrelevant: in both cases there is substantial posterior probability that the effect is negative, the data are consistent with a strongly negative effect, but a negligible effect cannot be ruled out.
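You can see this jitter with a toy simulation in Python (simulated draws standing in for real MCMC output; the posterior location and scale are made up to roughly match the example above):

```python
import numpy as np

# Toy illustration of Monte Carlo noise: re-drawing the "same" posterior
# with different seeds makes the 95% interval endpoints jitter by a few
# hundredths, enough to move an upper bound from just below zero to just
# above it.
for seed in range(4):
    rng = np.random.default_rng(seed)
    draws = rng.normal(-2.5, 1.3, 4_000)  # hypothetical posterior draws
    lo, hi = np.quantile(draws, [0.025, 0.975])
    print(f"seed {seed}: 95% CI [{lo:.2f}, {hi:.2f}]")
```

Running more iterations (more draws per seed) shrinks this jitter but never removes it entirely.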

All that assumes your model is a good model for the data, which you should check (e.g. using the pp_check functionality).

Getting to your actual question:

I would expect hypothesis("param < 0") to return the same CIs as summary for that parameter, so I am not sure what is going on here; I asked a separate question about this at Why does hypothesis return different CIs than summary?

You’ll also notice that for the first and second hypotheses, you get exactly the same CIs as in the summary (because all the terms except for the interaction cancel out).

I would expect hypothesis to do exactly the same thing as you did with the samples, so I think the reason for the discrepancy is the same.
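To make the cancellation point concrete, here is the kind of draw-by-draw computation I’d expect, sketched in Python with synthetic stand-ins for the posterior draws (the parameter names and values are hypothetical, not from your model):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4_000
# Synthetic stand-ins for MCMC draws; names are made up.
b_group = rng.normal(1.0, 0.5, n)
b_interaction = rng.normal(-0.8, 0.4, n)

# A hypothesis like "(group + interaction) - group < 0" is evaluated
# draw-by-draw, so everything except the interaction cancels out...
combo = (b_group + b_interaction) - b_group
assert np.allclose(combo, b_interaction)

# ...and its 95% CI is just the quantile-based CI of the interaction
# draws, which is what I'd expect summary to report for that parameter.
lo, hi = np.quantile(b_interaction, [0.025, 0.975])
print(f"95% CI for the interaction: [{lo:.2f}, {hi:.2f}]")
```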

So while I share your concern about the numerically different results, I don’t think the differences in the CIs would warrant notably different conclusions.
