I see that in brms, Dirichlet regression is formulated like so:
parameters {
  vector[Kc_muy2] b_muy2;  // population-level effects
  real Intercept_muy2;     // temporary intercept for centered predictors
  vector[Kc_muy3] b_muy3;  // population-level effects
  real Intercept_muy3;     // temporary intercept for centered predictors
  real<lower=0> phi;       // precision parameter
}
model {
  // initialize linear predictor term
  vector[N] muy2 = Intercept_muy2 + Xc_muy2 * b_muy2;
  // initialize linear predictor term
  vector[N] muy3 = Intercept_muy3 + Xc_muy3 * b_muy3;
  // linear predictor matrix
  vector[ncat] mu[N];
  for (n in 1:N) {
    mu[n] = [0, muy2[n], muy3[n]]';
  }
  // priors including all constants
  target += student_t_lpdf(Intercept_muy2 | 3, 0, 10);
  target += student_t_lpdf(Intercept_muy3 | 3, 0, 10);
  target += gamma_lpdf(phi | 0.01, 0.01);
  // likelihood including all constants
  if (!prior_only) {
    for (n in 1:N) {
      target += dirichlet_logit_lpdf(Y[n] | mu[n], phi);
    }
  }
}
I guess that for the first component, both intercept and slope are fixed at 0 with no uncertainty.
- Can we have information about the distribution of the first component?
- Can we use the other components to estimate uncertainty of the first?
This is because I would like to know which components' intercepts overlap (to a certain degree).
- Last question: if we constrained intercept and slope with a sum-to-zero constraint, would it be a fundamentally different model, with a different interpretation of the parameters?
Thanks a lot!
From muy2 and muy3, an imaginary muy1 has already been subtracted:

muy2 = (Intercept_muy2_star - Intercept_muy1_star) + Xc_muy2 * (b_muy2_star - b_muy1_star)

with (Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2
and (b_muy2_star - b_muy1_star) = b_muy2.

muy3 is analogous, and muy1 = 0.
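This invariance can be checked numerically: the Dirichlet-logit likelihood pushes the mu vector through a softmax, which is unchanged when the same quantity (here the imaginary muy1) is subtracted from every component. A minimal NumPy sketch with hypothetical values:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

# hypothetical "star" linear predictors for the three components
mu_star = np.array([0.4, 1.1, -0.3])

# reference coding: subtract the first component from all of them,
# so the first entry becomes 0, as in the brms-generated Stan code
mu_ref = mu_star - mu_star[0]

# the implied category probabilities are identical
print(np.allclose(softmax(mu_star), softmax(mu_ref)))  # True
```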
I think so; we have 2 equations and 2 unknowns.
The estimated parameters are different in the sum-to-zero case, because
muy1 = - Intercept_muy2 - Intercept_muy3 - Xc_muy3 * b_muy3 - Xc_muy2 * b_muy2
It’s a matter of the hyper-priors of the estimated parameters.
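The relationship between the two codings can also be sketched numerically: moving from reference coding (muy1 = 0) to sum-to-zero coding is just a re-centering of the same linear predictors, which leaves the softmax probabilities untouched while changing the parameter values. A sketch with hypothetical numbers:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

# reference coding: first component pinned at 0 (hypothetical values)
mu_ref = np.array([0.0, 1.5, -0.5])

# sum-to-zero coding: subtract the mean instead of the first component
mu_stz = mu_ref - mu_ref.mean()

print(np.isclose(mu_stz.sum(), 0.0))                  # constraint holds
print(np.allclose(softmax(mu_ref), softmax(mu_stz)))  # same probabilities
```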
Here Xc is the design matrix, and b the coefficients.
Don’t we have 5 equations and 6 unknowns (the star parameters)?
- (Intercept_muy2 + Xc_muy2 * b_muy2) + (Intercept_muy3 + Xc_muy3 * b_muy3) = 1
- (Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2
- (b_muy2_star - b_muy1_star) = b_muy2
- (Intercept_muy3_star - Intercept_muy1_star) = Intercept_muy3
- (b_muy3_star - b_muy1_star) = b_muy3
(In a sense I would expect the uncertainty of the four original parameters to be reduced, because it is "spread" among 6.)
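The star-parameter equations above can be written as a linear system and rank-checked. A sketch treating only the four contrast equations (the first equation involves the data through Xc, so it is left out here):

```python
import numpy as np

# unknowns ordered as: [I1*, I2*, I3*, b1*, b2*, b3*]
A = np.array([
    [-1.0, 1.0, 0.0,  0.0, 0.0, 0.0],  # I2* - I1* = Intercept_muy2
    [-1.0, 0.0, 1.0,  0.0, 0.0, 0.0],  # I3* - I1* = Intercept_muy3
    [ 0.0, 0.0, 0.0, -1.0, 1.0, 0.0],  # b2* - b1* = b_muy2
    [ 0.0, 0.0, 0.0, -1.0, 0.0, 1.0],  # b3* - b1* = b_muy3
])

# rank 4 with 6 unknowns: two free directions remain (a common shift of
# all intercepts and a common shift of all slopes), so the star
# parameters are not identified without an extra constraint per block
print(np.linalg.matrix_rank(A))  # 4
```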
Interesting comment, could you elaborate?
We are comparing muy2 with muy1 (muy3 is analogous); see dummy coding in:
https://stats.idre.ucla.edu/r/library/r-library-contrast-coding-systems-for-categorical-variables/#reverse
and deviation coding, which refers to the sum-to-zero constraint.
We are estimating something like:
(Intercept_muy2_star - Intercept_muy1_star) = Intercept_muy2 ~ student_t(3, 0, 10)
If the hyper-prior is informative, then different coding schemes result in different posterior distributions.
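One way to see this is to push the same student_t(3, 0, 10) hyper-prior through both codings and compare the implied prior on the intercept contrast. A Monte Carlo sketch (a hypothetical setup, not brms output):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# reference coding: the prior sits directly on the contrast
# Intercept_muy2 = I2* - I1* ~ student_t(3, 0, 10)
contrast_ref = 10 * rng.standard_t(3, n)

# sum-to-zero coding: the same prior sits on each coded intercept
# instead, so the contrast is a difference of two such draws
i = 10 * rng.standard_t(3, (n, 2))
contrast_stz = i[:, 1] - i[:, 0]

# compare a robust width measure of the two implied contrast priors
q_ref = np.percentile(np.abs(contrast_ref), 90)
q_stz = np.percentile(np.abs(contrast_stz), 90)
print(q_ref < q_stz)  # the implied contrast prior is wider under sum-to-zero
```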