I have a question about the use of emmeans package with brm hurdle models from the brms package. To illustrate the issue/question, I have modified the epilepsy example provided in brm help page. I am using the variable visit as a factor to illustrate the issue I am having.
Modifed example (with code in bold and output in normal text):
**prior1 ← prior(normal(0,10), class = b) **
fit1 ← brm(bf(count ~ visit, hu ~ visit), data = epilepsy, family = hurdle_poisson(), prior = prior1, cores=4)
Thus, visit is in both components of the hurdle model and my question is about the combined emmeans:
#Hurdle Component:
epilepsy %>% group_by(visit) %>% summarise(mean(count==0))
A tibble: 4 x 2
visit mean(count == 0)
1 1 0.0678
2 2 0.0678
3 3 0.136
4 4 0.119
#conditional effects
conditional_effects(fit1,‘visit’,dpar=‘hu’)$visit # probability of a zero -
visit count cond__ effect1__ estimate__ se__ lower__ upper__
1 1 8.254237 1 1 0.06611675 0.03196074 0.02133282 0.1453699
2 2 8.254237 1 2 0.06708003 0.03115784 0.02179011 0.1442654
3 3 8.254237 1 3 0.13616771 0.04401201 0.06645854 0.2347462
4 4 8.254237 1 4 0.11832673 0.04085320 0.05386295 0.2143953
conditional_effects(fit1,‘visit’,dpar=‘hu’,method=“posterior_linpred”)$visit # logit of the probability of a zero
visit count cond__ effect1__ estimate__ se__ lower__ upper__
1 1 8.254237 1 1 -2.647929 0.5274932 -3.825945 -1.771387
2 2 8.254237 1 2 -2.632433 0.5036723 -3.804268 -1.780306
3 3 8.254237 1 3 -1.847491 0.3743364 -2.642407 -1.181703
4 4 8.254237 1 4 -2.008372 0.3951938 -2.865945 -1.298632
**# emmeans **
emmeans(fit1,~visit,dpar=‘hu’,type=‘response’)
visit response lower.HPD upper.HPD
1 0.0661 0.0190 0.139
2 0.0671 0.0184 0.136
3 0.1362 0.0578 0.223
4 0.1183 0.0453 0.202
Point estimate displayed: median
Results are back-transformed from the logit scale
HPD interval probability: 0.95
emmeans(fit1,~visit,dpar=‘hu’)
visit emmean lower.HPD upper.HPD
1 -2.65 -3.74 -1.72
2 -2.63 -3.71 -1.74
3 -1.85 -2.60 -1.15
4 -2.01 -2.82 -1.28
Point estimate displayed: median
Results are given on the logit (not the response) scale.
HPD interval probability: 0.95
For the hurdle component of the model we can see agreement between conditional_effects and emmeans. 😉
Conditional Abundance:
**# conditional abundance **
subset(epilepsy,count > 0) %>% group_by(visit) %>% summarise(mean(count))
A tibble: 4 x 2
visit mean(count)
1 1 9.6
2 2 8.96
3 3 9.73
4 4 8.29
conditional_effects(fit1,‘visit’,dpar=‘mu’,type=‘response’)$visit #
visit count cond__ effect1__ estimate__ se__ lower__ upper__
1 1 8.254237 1 1 9.603691 0.4057366 8.776359 10.432075
2 2 8.254237 1 2 8.962067 0.4231933 8.169387 9.779043
3 3 8.254237 1 3 9.717162 0.4372259 8.897996 10.577895
4 4 8.254237 1 4 8.283749 0.4125866 7.530579 9.084487
conditional_effects(fit1,‘visit’,dpar=‘mu’,method=“posterior_linpred”)$visit
visit count cond__ effect1__ estimate__ se__ lower__ upper__
1 1 8.254237 1 1 2.262148 0.04226386 2.172062 2.344885
2 2 8.254237 1 2 2.193001 0.04699969 2.100394 2.280242
3 3 8.254237 1 3 2.273894 0.04487049 2.185826 2.358766
4 4 8.254237 1 4 2.114296 0.04998673 2.018972 2.206568
# emmeans
emmeans(fit1,~visit,dpar=‘mu’,type=‘response’)
visit rate lower.HPD upper.HPD
1 9.60 8.75 10.38
2 8.96 8.15 9.74
3 9.72 8.90 10.58
4 8.28 7.52 9.07
Point estimate displayed: median
Results are back-transformed from the log scale
HPD interval probability: 0.95
emmeans(fit1,~visit,dpar=‘mu’)
visit emmean lower.HPD upper.HPD
1 2.26 2.18 2.35
2 2.19 2.10 2.28
3 2.27 2.19 2.36
4 2.11 2.02 2.20
Point estimate displayed: median
Results are given on the log (not the response) scale.
HPD interval probability: 0.95
Again agreement between the model components 😊
The combined effect of the hurdle component of the model and the conditional mean component of the model:
subset(epilepsy) %>% group_by(visit) %>% summarise(mean(count))
A tibble: 4 x 2
visit mean(count)
1 1 8.95
2 2 8.36
3 3 8.41
4 4 7.31
conditional_effects(fit1,‘visit’,type=‘response’)$visit
visit count cond__ effect1__ estimate__ se__ lower__ upper__
1 1 8.254237 1 1 8.931023 0.4920391 7.928918 9.901036
2 2 8.254237 1 2 8.341463 0.4902433 7.382581 9.249976
3 3 8.254237 1 3 8.360136 0.5715124 7.255557 9.418519
4 4 8.254237 1 4 7.274172 0.5011159 6.298681 8.216822
conditional_effects(fit1,‘visit’,method=“posterior_linpred”)$visit
visit count cond__ effect1__ estimate__ se__ lower__ upper__
1 1 8.254237 1 1 2.262148 0.04226386 2.172062 2.344885
2 2 8.254237 1 2 2.193001 0.04699969 2.100394 2.280242
3 3 8.254237 1 3 2.273894 0.04487049 2.185826 2.358766
4 4 8.254237 1 4 2.114296 0.04998673 2.018972 2.206568
emmeans(fit1,~visit,type=‘response’)
visit rate lower.HPD upper.HPD
1 9.60 8.75 10.38
2 8.96 8.15 9.74
3 9.72 8.90 10.58
4 8.28 7.52 9.07
Point estimate displayed: median
Results are back-transformed from the log scale
HPD interval probability: 0.95
emmeans(fit1,~visit)
visit emmean lower.HPD upper.HPD
1 2.26 2.18 2.35
2 2.19 2.10 2.28
3 2.27 2.19 2.36
4 2.11 2.02 2.20
Point estimate displayed: median
Results are given on the log (not the response) scale.
HPD interval probability: 0.95
We can see that the conditional_effects function gives sensible answers ie. Conditional_abundance X ( 1- prob of a zero). However, emmeans ignores the probability of a zero. How does one get emmeans to give unconditional means, that is ones that incorporate the two model components.?
Any guidance would be greatly appreciated.
All the best,
Wade Blanchard