I see that in brms, Dirichlet regression is formulated like this:
```stan
parameters {
  vector[Kc_muy2] b_muy2;  // population-level effects
  real Intercept_muy2;     // temporary intercept for centered predictors
  vector[Kc_muy3] b_muy3;  // population-level effects
  real Intercept_muy3;     // temporary intercept for centered predictors
  real<lower=0> phi;       // precision parameter
}
model {
  // initialize linear predictor term
  vector[N] muy2 = Intercept_muy2 + Xc_muy2 * b_muy2;
  // initialize linear predictor term
  vector[N] muy3 = Intercept_muy3 + Xc_muy3 * b_muy3;
  // linear predictor matrix
  vector[ncat] mu[N];
  for (n in 1:N) {
    mu[n] = [0, muy2[n], muy3[n]]';
  }
  // priors including all constants
  target += student_t_lpdf(Intercept_muy2 | 3, 0, 10);
  target += student_t_lpdf(Intercept_muy3 | 3, 0, 10);
  target += gamma_lpdf(phi | 0.01, 0.01);
  // likelihood including all constants
  if (!prior_only) {
    for (n in 1:N) {
      target += dirichlet_logit_lpdf(Y[n] | mu[n], phi);
    }
  }
}
```
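For reference, a minimal Python sketch of what `dirichlet_logit_lpdf` computes, assuming (as in brms-generated code) that the concentration vector is `softmax(mu) * phi`; the function names and the example values of `y`, `mu`, and `phi` below are purely illustrative:

```python
import math
import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))  # shift by the max for numerical stability
    return e / e.sum()

def dirichlet_logit_lpdf(y, mu, phi):
    # alpha = softmax(mu) * phi, then an ordinary Dirichlet log density
    alpha = softmax(mu) * phi
    return (math.lgamma(alpha.sum())
            - sum(math.lgamma(a) for a in alpha)
            + np.sum((alpha - 1) * np.log(y)))

y = np.array([0.2, 0.5, 0.3])    # one simplex observation
mu = np.array([0.0, 0.8, -0.4])  # muy1 fixed at 0, as in the Stan code
print(dirichlet_logit_lpdf(y, mu, 1.5))
```

With `mu = 0` and `phi = ncat`, `alpha` is all ones and the density reduces to the uniform Dirichlet, which is a quick sanity check on the implementation.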
I guess that for the first component both intercept and slope are fixed to 0, with no uncertainty.
- Can we have information about the distribution of the first component?
- Can we use the other components to estimate uncertainty of the first?
This is because I would like to know which components' intercepts overlap (to a certain degree).
- Last question: if we constrain with a sum-to-zero constraint (on intercept and slope), would it be a fundamentally different model, with a different interpretation of the parameters?
Thanks a lot!
From muy2 and muy3, an imaginary muy1 has already been subtracted:
muy2 = (Intercept_muy2_star - Intercept_muy1_star) + Xc_muy2 * (b_muy2_star - b_muy1_star)
with (Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2
and (b_muy2_star - b_muy1_star) = b_muy2,
and analogously for muy3; muy1 = 0.
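This subtraction is harmless because the softmax (and hence the Dirichlet likelihood) is invariant to adding a constant to every component. A small numerical check, with made-up "star" predictor values:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())  # shift for stability; softmax is shift-invariant
    return e / e.sum()

# hypothetical unconstrained "star" linear predictors for the three components
mu_star = np.array([0.4, -1.2, 0.7])
# brms parameterization: subtract muy1_star from every component, so muy1 = 0
mu_ref = mu_star - mu_star[0]

# the implied category probabilities are identical
assert np.allclose(softmax(mu_star), softmax(mu_ref))
print(softmax(mu_ref))
```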
I think so: we have 2 equations and 2 unknowns.
The estimated parameters are different in the sum-to-zero case, because
muy1 = -Intercept_muy2 - Xc_muy2 * b_muy2 - Intercept_muy3 - Xc_muy3 * b_muy3
It’s a matter of the hyper-priors on the estimated parameters.
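To illustrate: reference coding (first component fixed at 0) and sum-to-zero coding give identical probabilities, i.e. the same likelihood, but different parameter values. A sketch with made-up predictor values:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

mu_star = np.array([0.4, -1.2, 0.7])  # made-up "star" predictors
mu_ref = mu_star - mu_star[0]         # reference coding: muy1 fixed at 0
mu_stz = mu_star - mu_star.mean()     # sum-to-zero coding

# muy1 is minus the sum of the other components, as in the formula above
assert np.isclose(mu_stz[0], -(mu_stz[1] + mu_stz[2]))
# same probabilities (same likelihood) ...
assert np.allclose(softmax(mu_ref), softmax(mu_stz))
# ... but different parameter values
assert not np.allclose(mu_ref, mu_stz)
```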
Here, Xc is the design matrix and b the coefficients.
Don’t we have 5 equations and 6 unknowns (the star parameters)?

- (Intercept_muy2 + Xc_muy2 * b_muy2) + (Intercept_muy3 + Xc_muy3 * b_muy3) = 1
- (Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2
- (b_muy2_star - b_muy1_star) = b_muy2
- (Intercept_muy3_star - Intercept_muy1_star) = Intercept_muy3
- (b_muy3_star - b_muy1_star) = b_muy3
(In a sense I would expect the uncertainty of the four original parameters to be reduced, because it is “spread” among 6.)
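One way to see the counting: treat just the four mapping equations as a linear system in the six star parameters and compute its rank. There are two remaining degrees of freedom (shifting all star intercepts by one constant and all star slopes by another), which is why two extra constraints, such as fixing component 1 to zero or sum-to-zero on intercepts and slopes, identify the model. A sketch:

```python
import numpy as np

# unknowns, in order: (I1*, I2*, I3*, b1*, b2*, b3*)
# one row per mapping equation from the list above
A = np.array([
    [-1, 1, 0,  0, 0, 0],  # I2* - I1* = Intercept_muy2
    [-1, 0, 1,  0, 0, 0],  # I3* - I1* = Intercept_muy3
    [ 0, 0, 0, -1, 1, 0],  # b2* - b1* = b_muy2
    [ 0, 0, 0, -1, 0, 1],  # b3* - b1* = b_muy3
])
rank = np.linalg.matrix_rank(A)
print(rank, A.shape[1] - rank)  # rank and remaining degrees of freedom
```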
Interesting comment, could you elaborate?
We are comparing muy2 with muy1 (muy3 analogously); see dummy coding in:
https://stats.idre.ucla.edu/r/library/r-library-contrast-coding-systems-for-categorical-variables/#reverse
and deviation coding, which corresponds to the sum-to-zero constraint.
We are estimating something like:
(Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2 ~ student_t(3, 0, 10)
If the hyper-prior is informative, then different coding schemes result in different posterior distributions.
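A quick simulation of this point, assuming for illustration that under deviation coding each star intercept independently gets the same student_t(3, 0, 10) prior (seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# dummy coding: the contrast Intercept_muy2 itself gets the student_t(3, 0, 10) prior
contrast_dummy = 10 * rng.standard_t(3, size=n)

# deviation coding (illustrative assumption: independent priors on the stars):
# the same contrast is the difference of two star intercepts
star = 10 * rng.standard_t(3, size=(2, n))
contrast_dev = star[1] - star[0]

# the implied prior on the contrast is wider under deviation coding
iqr = lambda x: np.subtract(*np.percentile(x, [75, 25]))
print(iqr(contrast_dummy), iqr(contrast_dev))
```

So even though both codings describe the same likelihood, an informative hyper-prior is placed on different quantities, and the implied prior on a given contrast (and hence the posterior) differs between schemes.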