Dirichlet regression - understand missing slope parameter

I see that in brms the Dirichlet regression is formulated as follows:

parameters {
  vector[Kc_muy2] b_muy2;  // population-level effects
  real Intercept_muy2;  // temporary intercept for centered predictors
  vector[Kc_muy3] b_muy3;  // population-level effects
  real Intercept_muy3;  // temporary intercept for centered predictors
  real<lower=0> phi;  // precision parameter
}
model {
  // initialize linear predictor term
  vector[N] muy2 = Intercept_muy2 + Xc_muy2 * b_muy2;
  // initialize linear predictor term
  vector[N] muy3 = Intercept_muy3 + Xc_muy3 * b_muy3;
  // linear predictor matrix
  vector[ncat] mu[N];
  for (n in 1:N) {
    mu[n] = [0, muy2[n], muy3[n]]';
  }
  // priors including all constants
  target += student_t_lpdf(Intercept_muy2 | 3, 0, 10);
  target += student_t_lpdf(Intercept_muy3 | 3, 0, 10);
  target += gamma_lpdf(phi | 0.01, 0.01);
  // likelihood including all constants
  if (!prior_only) {
    for (n in 1:N) {
      target += dirichlet_logit_lpdf(Y[n] | mu[n], phi);
    }
  }
}
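
To make explicit what this parameterization does (a sketch of the math, not the brms source): as I understand the dirichlet_logit likelihood, the linear predictors are pushed through a softmax to get the mean simplex, with the first component's predictor pinned at 0 as the reference category, and the Dirichlet concentration is that mean scaled by phi. A minimal Python sketch with hypothetical values for muy2, muy3 and phi:

```python
import math

def softmax(xs):
    # numerically stable softmax: shift by the max before exponentiating
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# hypothetical linear-predictor values for one observation;
# the first entry is fixed at 0 (the reference category)
muy2, muy3 = 0.5, -1.0
mu = [0.0, muy2, muy3]
phi = 5.0  # precision parameter

p = softmax(mu)                 # mean of the Dirichlet (sums to 1)
alpha = [phi * pi for pi in p]  # concentration vector: Y ~ Dirichlet(alpha)
print(p, alpha)
```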

I guess that for the first component both the intercept and the slope are fixed at 0 with no uncertainty.

  • Can we have information about the distribution of the first component?
  • Can we use the other components to estimate uncertainty of the first?

This is because I would like to know which components' intercepts overlap (to a certain degree).
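
Regarding the first two bullets, note that since muy1 is identically 0, the first component's mean is p1 = 1 / (1 + exp(muy2) + exp(muy3)), so its distribution (and uncertainty) comes for free from the joint posterior draws of muy2 and muy3. A sketch with made-up draws:

```python
import math

# made-up joint posterior draws of the two free linear predictors
draws = [(0.4, -1.1), (0.6, -0.9), (0.5, -1.0), (0.7, -1.2)]

# push each draw through the softmax to get the reference component's mean
p1_draws = [1.0 / (1.0 + math.exp(m2) + math.exp(m3)) for m2, m3 in draws]

# posterior summaries for the first component, obtained from the others
mean_p1 = sum(p1_draws) / len(p1_draws)
sd_p1 = (sum((p - mean_p1) ** 2 for p in p1_draws) / (len(p1_draws) - 1)) ** 0.5
print(mean_p1, sd_p1)
```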

  • Last question: if we constrain intercept and slope with a sum-to-zero constraint, would it be a fundamentally different model, with a different interpretation of the parameters?

Thanks a lot!

From muy2 and muy3, an imaginary muy1 has already been subtracted.

muy2 = (Intercept_muy2_star - Intercept_muy1_star) + Xc * (b_muy2_star - b_muy1_star)

with (Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2;
and (b_muy2_star - b_muy1_star) = b_muy2.

muy3 is analogous, and muy1 = 0.
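
The reason muy1 can be pinned at 0 without losing anything is that the softmax is invariant to adding a constant to all components, so subtracting muy1_star from every predictor changes the parameter values but not the implied probabilities. A quick numerical check (hypothetical "star" values):

```python
import math

def softmax(xs):
    # numerically stable softmax
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    return [e / sum(exps) for e in exps]

# hypothetical "star" predictors for all three components
mu_star = [0.3, 0.8, -0.7]

# reference coding: subtract the first component from all of them
mu_ref = [m - mu_star[0] for m in mu_star]  # first entry becomes 0

p_star = softmax(mu_star)
p_ref = softmax(mu_ref)
print(p_star, p_ref)  # identical simplex vectors
```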

I think so, we have 2 equations and 2 unknowns.

The estimated parameters are different in the sum-to-zero case, because
muy1 = - Intercept_muy2 - Intercept_muy3 - Xc_muy3 * b_muy3 - Xc_muy2 * b_muy2

It’s a matter of the hyper-priors of the estimated parameters.
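
To make the reparameterization explicit (a sketch with hypothetical values): a sum-to-zero coding centers the predictors instead of pinning the first one at 0. The fitted probabilities are identical, but the parameter values that the hyper-priors see are different:

```python
import math

def softmax(xs):
    # numerically stable softmax
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    return [e / sum(exps) for e in exps]

# reference coding: first predictor pinned at 0
mu_ref = [0.0, 1.2, -0.8]

# equivalent sum-to-zero coding: center the same predictors
shift = sum(mu_ref) / len(mu_ref)
mu_stz = [m - shift for m in mu_ref]

print(softmax(mu_ref), softmax(mu_stz))  # same probabilities
print(mu_ref, mu_stz)                    # different parameter values
```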


Here Xc is the design matrix, and b the coefficients.

Don’t we have 5 equations and 6 unknowns (the star parameters)?

  • (Intercept_muy2 + Xc_muy2 * b_muy2) + (Intercept_muy3 + Xc_muy3 * b_muy3) = 1

  • (Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2

  • (b_muy2_star - b_muy1_star) = b_muy2

  • (Intercept_muy3_star - Intercept_muy1_star) = Intercept_muy3

  • (b_muy3_star - b_muy1_star) = b_muy3

(In a sense I would expect that the uncertainty of the four initial parameters would be reduced, because it is “spread” among 6.)

Interesting comment, could you elaborate?

We are comparing muy2 with muy1 (and muy3 analogously); see dummy coding in:
https://stats.idre.ucla.edu/r/library/r-library-contrast-coding-systems-for-categorical-variables/#reverse
and deviation coding, which corresponds to the sum-to-zero constraint.
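
The linked page's point can be checked numerically (a sketch with made-up group means): dummy and deviation coding are two bases for the same model. They reproduce identical fitted values, but with different coefficient values, which is exactly where an informative prior can distinguish them:

```python
# hypothetical group means for a 3-level factor
g = [2.0, 3.5, 1.5]
gbar = sum(g) / 3

# dummy (treatment) coding: level 1 is the reference
dummy_rows = [[1, 0, 0], [1, 1, 0], [1, 0, 1]]
dummy_beta = [g[0], g[1] - g[0], g[2] - g[0]]

# deviation (sum-to-zero) coding: intercept is the grand mean
dev_rows = [[1, -1, -1], [1, 1, 0], [1, 0, 1]]
dev_beta = [gbar, g[1] - gbar, g[2] - gbar]

def fitted(rows, beta):
    # fitted value for each level: row of the design matrix times coefficients
    return [sum(x * b for x, b in zip(row, beta)) for row in rows]

print(fitted(dummy_rows, dummy_beta))  # both recover the group means
print(fitted(dev_rows, dev_beta))
```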

We are estimating something like:
(Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2 ~ student_t(3, 0, 10)

If the hyper-prior is informative, then different coding schemes result in different posterior distributions.
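
To make that concrete (a sketch under assumptions, not the brms internals): hold the fitted component means fixed and express them in the two codings. The coded parameter values differ, so an informative hyper-prior assigns them different total log densities, and the posteriors therefore differ. Here I use a deliberately informative student_t(3, 0, 1) rather than the model's student_t(3, 0, 10) so the effect is visible, and I treat the first two sum-to-zero values as the free parameters purely for illustration:

```python
import math

def student_t_lpdf(x, nu, mu, sigma):
    # log density of the location-scale Student-t distribution
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

# the same three component means, expressed in the two codings
mu_ref = [0.0, 1.2, -0.8]             # reference coding: first pinned at 0
shift = sum(mu_ref) / 3
mu_stz = [m - shift for m in mu_ref]  # sum-to-zero coding

# total prior log density of the free parameters under each coding
lp_ref = sum(student_t_lpdf(m, 3, 0, 1) for m in mu_ref[1:])
lp_stz = sum(student_t_lpdf(m, 3, 0, 1) for m in mu_stz[:2])
print(lp_ref, lp_stz)  # different values -> different posteriors
```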
