I am struggling to wrap my head around the difference between fitted and predict. In my case, it is about the multinomial family. There are a few relevant posts here and on GitHub. After reading them, along with pp_expect.R and posterior_predict.R, my current understanding is that fitted gives draws of the expected value of the posterior predictive distribution, while predict gives draws from the posterior predictive distribution itself. How far am I from the truth? If not too far, should one expect the means of the two to match? That is, if I take the average of the draws from fitted and the average of those from predict, should they not be the same? In my example, they deviate.
You are correct. The deviations are likely due to random error in the posterior predictions.
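To illustrate why the two means should agree up to Monte Carlo error, here is a minimal base-R sketch. It does not use brms: the Dirichlet draws are only an assumed stand-in for posterior draws of the category probabilities, and the names probs, expected, and predicted are hypothetical.

```r
set.seed(1)
n_draws <- 4000
total <- 13000
counts <- c(a = 3000, b = 3000, c = 7000)

# Assumed stand-in for posterior draws of the category probabilities:
# Dirichlet(counts + 1), sampled via normalized gamma variates.
g <- sapply(counts + 1, function(shape) rgamma(n_draws, shape = shape))
probs <- g / rowSums(g)

# Analogue of fitted(): the expected counts for each draw.
expected <- probs * total

# Analogue of predict(): one simulated multinomial outcome per draw.
predicted <- t(apply(probs, 1, function(p) rmultinom(1, size = total, prob = p)))
colnames(predicted) <- names(counts)

# Both column means estimate the same quantity, so they should nearly coincide.
colMeans(expected) / total
colMeans(predicted) / total
```

With a few thousand draws, the two sets of proportions should differ only in the later decimal places; a gap in the first or second decimal would point to something other than sampling noise.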
Thank you for the quick reply. I guess I am misusing some of the functions then. Let me give a minimal working example:
library(brms)
library(tibble)
data <- tribble(
~a, ~b, ~c, ~total,
3000, 3000, 7000, 13000,
)
data$counts <- with(data, cbind(a, b, c))
formula <- brmsformula(counts | trials(total) ~ 1)
fit <- brm(formula, data, multinomial(), seed = 42)
data$counts / data$total
# a b c
# [1,] 0.2307692 0.2307692 0.5384615
colMeans(fitted(fit, summary = FALSE)) / data$total
# a b c
# [1,] 0.2309553 0.2307284 0.5383163
colMeans(predict(fit, summary = FALSE)) / data$total
# a b c
# [1,] 0.1364585 0.2725709 0.5909706
This difference arguably cannot be attributed to chance. I have tried different seeds, and the result is the same. In addition, if it were due to chance, it would shrink as the sample size grows.
I will take a look. Thanks!
Thanks! It turned out to be a slightly embarrassing typo: I wrote pcategorical instead of dcategorical, so I accidentally used the cumulative distribution function instead of the density. It should be fixed now on GitHub.
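As a rough check of that explanation, the arithmetic can be sketched in base R. This is not the brms internals, only an assumption about the effect of the typo: if the cumulative probabilities are used as (unnormalized) sampling weights in place of the category probabilities, the resulting proportions come very close to the ones predict returned in the example above.

```r
counts <- c(a = 3000, b = 3000, c = 7000)

# Category probabilities, i.e. what the density would supply.
p <- counts / sum(counts)

# Cumulative probabilities, i.e. what the CDF supplies instead.
cdf <- cumsum(p)

round(p, 4)
#      a      b      c
# 0.2308 0.2308 0.5385

# CDF values renormalized to sum to one, as a sampler would treat weights.
round(cdf / sum(cdf), 4)
#      a      b      c
# 0.1364 0.2727 0.5909
```

The renormalized values are 3/22, 6/22, and 13/22, which line up with the reported 0.1365, 0.2726, and 0.5910.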
That was fast! Thank you so much!