Please also provide the following information in addition to your question:
- Operating System: Win 10
- brms Version: 2.10
Hi.
In the case of categorical models, multinomial models, Dirichlet regression, etc all the coefficients in the posterior are ‘relative’ to the reference category. This makes getting a sense of covariate effect strength for each category rather hard to wrap my head around. Marginal_effects / Conditional_effects makes plotting the association easy, but at times I want to directly compare the effects like one would with a simpler regression problem.
Consider the following example:
bind <- function(...) cbind(...)
set.seed(10101)
N <- 20
df <- data.frame(
y1 = rbinom(N, 10, 0.5), y2 = rbinom(N, 10, 0.7),
y3 = rbinom(N, 10, 0.9), x = rnorm(N)
) %>%
mutate(
size = y1 + y2 + y3,
y1 = y1 / size,
y2 = y2 / size,
y3 = y3 / size
)
df$y <- with(df, cbind(y1, y2, y3))
fit<-brm(y~x,df,dirichlet(refcat = "y2"),cores=4)
summary(fit)
Family: dirichlet
Links: muy1 = logit; muy3 = logit; phi = identity
Formula: y ~ x
Data: df (Number of observations: 20)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
muy1_Intercept -0.28 0.08 -0.44 -0.13 1.00 3793 3180
muy3_Intercept 0.22 0.07 0.08 0.35 1.00 3866 2975
muy1_x -0.02 0.08 -0.17 0.13 1.00 3557 3089
muy3_x -0.04 0.07 -0.18 0.09 1.00 3555 2852
I would like to compare the effect of covariate x on the probability of y1,y2 and y3.
Currently my workaround for doing so is to use the fitted() function with newdata consisting of the min() and max() of the covariate, then manually calculating slopes:
fitted_newdata<-fitted(fit,newdata = newdata,summary = F,nsamples = 1000)
dimnames(fitted_newdata)<-list("Sample"=sprintf("s%d",1:1000),"x"=c("x1","x2"),"Category"=dimnames(fitted_newdata)[[3]])
fitted_newdata %>% as_tibble() %>%
mutate(.sample=1:1000) %>%
gather(Var,Val,-.sample) %>%
separate(Var,sep = ".y",into = c("x","Category")) %>%
spread(x,Val) %>%
mutate(slope=(x2-x1)/(max(df$x)-min(df$x))) %>%
compare_levels(by = Category,slope,draw_indices = ".sample")%>%
ggplot(aes(y=as.factor(Category),x=slope))+geom_halfeyeh()+ylab("Comparison")+xlab("Slope for x")
Questions
- Is my fitted() way a complete abomination or is this valid?
- If indeed it is, is there a better way to compare the effect size/betas for the covariates INCLUDING for the reference category?
Thanks, as always
Hugo