Understanding the parameters of a hierarchical Dirichlet regression

Dear all,

Apologies if my question is basic or poorly stated; I am fairly new to working with brms and only have basic statistical knowledge. My main struggle is getting the model parameters (group- and population-level) from a Dirichlet regression into an interpretable form on the probability scale.
I am running a Dirichlet regression in brms on a response that is a composition over three particle size classes (y1, y2, y3). My predictor is a treatment variable with two levels (treatmentA and treatmentB), with varying intercepts and treatment effects across groups (27 groups).

brm(bind(y1, y2, y3) ~ Treatment + (1 + Treatment | Group), data = data,
    family = dirichlet(refcat = "y3"),
    chains = 4, iter = 4000)

The Dirichlet regression gives me the output shown below. I understand that all population-level parameters in the output are given as deviations from my reference category y3. However, the main purpose of my model is to quantify the impact of treatment on each response category and to determine how much each group differs from the overall population. I would therefore like to extract the population- and group-level parameters, if possible as probabilities and independently of the values of the reference category. Is there any way to do this?
Is setting refcat = NA the only option to obtain estimates for each of the three responses?

I would be very grateful for any recommendations.
Many thanks!

Family: dirichlet
Links: muy1 = logit; muy2 = logit; phi = identity
Formula: bind(y1, y2, y3) ~ Treatment + (1 + Treatment | Group)
Data: data (Number of observations: 80)
Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
total post-warmup draws = 8000

Group-Level Effects:
~Group (Number of levels: 10)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(muy1_Intercept) 0.06 0.04 0.00 0.17 1.00 2095 3906
sd(muy1_TreatmentTreatmentB) 0.15 0.10 0.01 0.36 1.00 1405 2306
sd(muy2_Intercept) 0.06 0.04 0.00 0.16 1.00 2571 3929
sd(muy2_TreatmentTreatmentB) 0.19 0.11 0.02 0.43 1.00 1436 2209
cor(muy1_Intercept,muy1_TreatmentTreatmentB) -0.39 0.52 -0.98 0.84 1.00 2538 3051
cor(muy2_Intercept,muy2_TreatmentTreatmentB) -0.32 0.54 -0.98 0.87 1.00 1757 3289

Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
muy1_Intercept 2.94 0.05 2.84 3.05 1.00 5660 4115
muy2_Intercept 1.33 0.06 1.22 1.44 1.00 6802 6038
muy1_TreatmentTreatmentB 0.64 0.09 0.46 0.84 1.00 4121 2175
muy2_TreatmentTreatmentB 0.04 0.11 -0.18 0.26 1.00 4115 4141

Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi 283.37 33.63 221.02 353.89 1.00 7092 6126

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).


Hi, and sorry for the late response. posterior_epred() will be your friend in a model like this. For example:

library(brms)

# Simulate compositional data: y1, y2, y3 sum to 1 in every row
df <- data.frame(y1 = runif(1000))
df$y2 <- runif(1000, 0, 1 - df$y1)
df$y3 <- 1 - df$y1 - df$y2
# Grouping factor with four levels, recycled over the 1000 rows
df$g <- c("a", "b", "c", "d")

# Dirichlet regression with y3 as the reference category
mod <- brm(
  cbind(y1, y2, y3) ~ g,
  data = df,
  family = dirichlet(refcat = "y3"),
  backend = "cmdstanr",
  chains = 1
)

a <- posterior_epred(mod, newdata = data.frame(g = c("a", "b", "c", "d")))
# Dimensions: posterior draws x rows of newdata x response categories
dim(a)

# Expected probabilities for the first posterior draw
a[1, , ]

# Note that the probabilities always sum to 1 for every group
rowSums(a[1, , ])
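
To turn those draws into something directly interpretable, you can summarise them per row of newdata, and for your own model you can request group- and population-level probabilities explicitly. Below is a minimal sketch; the object name fit for your fitted model is an assumption on my side (your data frame is called data, as in your brm() call), and the summaries of a refer to the toy example above.

# Posterior mean and 95% interval of the expected probabilities per group
probs_mean <- apply(a, c(2, 3), mean)
probs_low  <- apply(a, c(2, 3), quantile, probs = 0.025)
probs_high <- apply(a, c(2, 3), quantile, probs = 0.975)
probs_mean

# Applied to your model (assuming the fitted object is called fit):
# keep the group-level effects with re_formula = NULL to get group-specific
# probabilities, or drop them with re_formula = NA for population-level ones.
nd_group <- expand.grid(Group = unique(data$Group),
                        Treatment = unique(data$Treatment))
p_group <- posterior_epred(fit, newdata = nd_group, re_formula = NULL)

nd_pop <- data.frame(Treatment = unique(data$Treatment))
p_pop  <- posterior_epred(fit, newdata = nd_pop, re_formula = NA)

# Treatment effect on the probability scale: difference in expected
# probabilities between the two rows of nd_pop (one row per treatment level)
delta <- p_pop[, 2, ] - p_pop[, 1, ]
colMeans(delta)                                      # mean change for y1, y2, y3
apply(delta, 2, quantile, probs = c(0.025, 0.975))   # 95% intervals

The group-specific probabilities in p_group can then be compared against the corresponding rows of p_pop to see how much each group deviates from the overall population, all on the probability scale and without reference to y3.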