Dirichlet regression - understanding the missing slope parameter

I see that in brms, Dirichlet regression is formulated like this:

```stan
parameters {
  vector[Kc_muy2] b_muy2;  // population-level effects
  real Intercept_muy2;     // temporary intercept for centered predictors
  vector[Kc_muy3] b_muy3;  // population-level effects
  real Intercept_muy3;     // temporary intercept for centered predictors
  real<lower=0> phi;       // precision parameter
}
model {
  // initialize linear predictor term
  vector[N] muy2 = Intercept_muy2 + Xc_muy2 * b_muy2;
  // initialize linear predictor term
  vector[N] muy3 = Intercept_muy3 + Xc_muy3 * b_muy3;
  // linear predictor matrix
  vector[ncat] mu[N];
  for (n in 1:N) {
    mu[n] = [0, muy2[n], muy3[n]]';
  }
  // priors including all constants
  target += student_t_lpdf(Intercept_muy2 | 3, 0, 10);
  target += student_t_lpdf(Intercept_muy3 | 3, 0, 10);
  target += gamma_lpdf(phi | 0.01, 0.01);
  // likelihood including all constants
  if (!prior_only) {
    for (n in 1:N) {
      target += dirichlet_logit_lpdf(Y[n] | mu[n], phi);
    }
  }
}
```
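For intuition (my illustration, not part of the brms-generated code): `dirichlet_logit` maps `mu` through a softmax, so adding any constant to all components leaves the implied probabilities unchanged. Fixing the first component at 0 removes exactly that redundancy. A minimal NumPy sketch with made-up values:

```python
import numpy as np

def softmax(x):
    """Map a vector of linear predictors to probabilities."""
    e = np.exp(x - x.max())
    return e / e.sum()

# brms-style linear predictors: first component fixed at 0 (hypothetical values)
mu = np.array([0.0, 0.8, -0.3])
shift = 1.7  # arbitrary constant added to every component

p_ref = softmax(mu)
p_shifted = softmax(mu + shift)

# only differences between components are identified:
print(np.allclose(p_ref, p_shifted))  # True
```

This is why only `ncat - 1` intercepts and slope vectors appear in the parameters block.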

I guess that for the first component both intercept and slope are fixed at 0, with no uncertainty.

• Can we have information about the distribution of the first component?
• Can we use the other components to estimate uncertainty of the first?

This is because I would like to know which components' intercepts overlap (to a certain degree).

• Last question: if we impose a sum-to-zero constraint (on intercepts and slopes), would it be a fundamentally different model, with a different interpretation of the parameters?

Thanks a lot!

From `muy2` and `muy3`, an imaginary `muy1` has already been subtracted:

`muy2 = (Intercept_muy2_star - Intercept_muy1_star) + Xc_muy2 * (b_muy2_star - b_muy1_star)`

with `(Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2`
and `(b_muy2_star - b_muy1_star) = b_muy2`,

and `muy3` analogously, and `muy1 = 0`.

I think so; we have 2 equations and 2 unknowns.

The estimated parameters are different in the sum-to-zero case, because
`muy1 = - Intercept_muy2 - Xc_muy2 * b_muy2 - Intercept_muy3 - Xc_muy3 * b_muy3`
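To make the equivalence concrete (a sketch with made-up numbers, not from the thread): subtracting the component mean from the reference-coded linear predictors yields a sum-to-zero version with different parameter values but identical implied probabilities:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# reference coding (as in the brms-generated Stan code): muy1 fixed at 0
# hypothetical values of [muy1, muy2, muy3] for one observation
mu_ref = np.array([0.0, 0.8, -0.3])

# equivalent sum-to-zero parameterization: subtract the mean of the components
mu_stz = mu_ref - mu_ref.mean()

print(mu_stz.sum())  # ~0 (up to floating point): the constraint holds
print(np.allclose(softmax(mu_ref), softmax(mu_stz)))  # True: same probabilities
```

So the two parameterizations describe the same distribution over the simplex; only the meaning (and priors) of the individual parameters differ.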

It's a matter of the hyper-priors of the estimated parameters.


Here `Xc` is the design matrix, and `b` the coefficients.

Don't we have 5 equations and 6 unknowns (the star parameters)?

• ` (Intercept_muy2 + Xc_muy2 * b_muy2) + (Intercept_muy3 + Xc_muy3 * b_muy3) = 1`

• `(Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2`

• `(b_muy2_star - b_muy1_star) = b_muy2`

• `(Intercept_muy3_star - Intercept_muy1_star) = Intercept_muy3`

• `(b_muy3_star - b_muy1_star) = b_muy3`

(In a sense I would expect that the uncertainty of the four initial parameters would be reduced because it is "spread" among 6.)

Interesting comment, could you elaborate?

We are comparing `muy2` with `muy1` (`muy3` analogously); see dummy coding in:
https://stats.idre.ucla.edu/r/library/r-library-contrast-coding-systems-for-categorical-variables/#reverse
and deviation coding, which refers to the sum-to-zero constraint.
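As a small illustration of the two coding schemes (with hypothetical group means, not from the thread): both codings reproduce the same fitted values, but the intercept means "first group" under dummy coding and "grand mean" under deviation coding:

```python
import numpy as np

# hypothetical cell means for a 3-level factor
group_means = np.array([2.0, 5.0, 9.0])

# dummy (treatment) coding: intercept is the first group's mean
X_dummy = np.array([[1, 0, 0],
                    [1, 1, 0],
                    [1, 0, 1]], dtype=float)

# deviation (sum-to-zero) coding: intercept is the grand mean
X_dev = np.array([[1, -1, -1],
                  [1,  1,  0],
                  [1,  0,  1]], dtype=float)

b_dummy = np.linalg.solve(X_dummy, group_means)
b_dev = np.linalg.solve(X_dev, group_means)

print(b_dummy)  # [2. 3. 7.]: intercept = first group's mean
print(b_dev)    # intercept = grand mean (16/3), then deviations from it
print(np.allclose(X_dummy @ b_dummy, X_dev @ b_dev))  # True: same fitted values
```

The model fit is identical; only the coefficients (and therefore what a prior on them means) change.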

We are estimating something like:
(Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2 ~ student_t(3, 0, 10)

If the hyper-prior is informative, then different coding schemes result in different posterior distributions.
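A quick way to see this (my sketch, using the `student_t(3, 0, 10)` scale from the generated Stan code, and assuming independent priors on the free sum-to-zero intercepts): the same hyper-prior placed on differently coded parameters implies different prior spreads for the same contrast `muy2 - muy1`:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# dummy coding: the contrast muy2 - muy1 IS the parameter Intercept_muy2,
# so it gets the student_t(3, 0, 10) prior directly
diff_dummy = 10 * rng.standard_t(3, size=n)

# sum-to-zero coding (assumed prior): free intercepts a2, a3 each get the
# same t(3, 0, 10) prior, and a1 = -(a2 + a3);
# then muy2 - muy1 = a2 - a1 = 2*a2 + a3
a2 = 10 * rng.standard_t(3, size=n)
a3 = 10 * rng.standard_t(3, size=n)
diff_stz = 2 * a2 + a3

# the implied prior on the same contrast is wider under sum-to-zero coding
q_dummy = np.quantile(np.abs(diff_dummy), 0.9)
q_stz = np.quantile(np.abs(diff_stz), 0.9)
print(q_dummy < q_stz)  # True
```

With a flat hyper-prior this difference would vanish; with an informative one, the posteriors under the two codings genuinely differ.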
