# Question about bulk and tail ESS

I am using a translated and scaled simplex as described in section 1.7 of the Stan User's Guide (v2.24) to center coefficients in a relatively simple multivariate response model. Specifically,

```stan
parameters {
  ...
  simplex[N_forms] beta_raw[N_trait];
  vector[N_trait] beta_scale;
}

transformed parameters {
  vector[N_trait] mu_form[N_forms];

  for (i in 1:N_trait) {
    for (j in 1:N_forms) {
      mu_form[j][i] = beta_scale[i] * (beta_raw[i][j] - 1.0 / N_forms);
    }
  }
  ...
}

model {
  ...
  for (i in 1:N_trait) {
    // `one`: vector of ones for a uniform Dirichlet (declared elsewhere in the model)
    beta_raw[i] ~ dirichlet(one);
    beta_scale[i] ~ normal(0.0, 1.0);
  }
  ...
}
```

I combine `mu_form` with another linear term to model the multivariate mean vectors, but what I'm really interested in is the covariance/correlation matrix associated with those vectors.

My code runs fine, and the diagnostics look good, except that I get a warning about a small bulk and tail ESS. When I examine bulk and tail ESS for each of the parameters in the model, I discover that the warning is because the bulk and tail ESS for `beta_raw` and `beta_scale` are very small, i.e., 6-20. The bulk and tail ESS for `mu_form` are a bit small (< 300), and the bulk and tail ESS for the other linear term is also small (150 or so).

BUT the bulk and tail ESS for all of the parameters Iâ€™m interested in are all > 400.

Do I need to worry about the small bulk and tail ESS for my "nuisance" parameters, or am I safe to ignore them? My understanding is that the mean/median and quantiles for my "nuisance" parameters may be unreliable, but since I'm not interested in estimating them, do I need to worry about the warning?

Kent

You should worry about them, unfortunately. Even if these are nuisance parameters, you're still integrating them out, so you want to sample them well.

I just went over to have a look at section 1.7. There are a bunch of other ways to impose a sum-to-zero constraint. It's probably worth trying those other parameterizations because they might behave quite differently.
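
If I remember that section right, one of the alternatives is a soft sum-to-zero constraint, roughly along these lines (a sketch, not the exact code from the guide; `K` and the prior scales are placeholders):

```stan
data {
  int<lower=2> K;
}
parameters {
  vector[K] beta;  // K free parameters, softly centered below
}
model {
  beta ~ normal(0, 1);           // placeholder prior
  sum(beta) ~ normal(0, 0.001);  // soft sum-to-zero: pull the sum of beta toward zero
}
```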

Damn! I was worried that was probably the case. I'll try one of the other formulations and hope that I have better luck.

Thank you for the quick response.

Kent

Quick update: Contrast coding seems to work pretty well.

```stan
parameters {
  ...
  vector[N_forms - 1] beta_raw[N_trait];
  ...
}

transformed parameters {
  vector[N_trait] mu_form[N_forms];

  for (i in 1:N_trait) {
    for (j in 1:(N_forms - 1)) {
      mu_form[j][i] = beta_raw[i][j];
    }
    mu_form[N_forms][i] = 0.0;
  }
  ...
}

model {
  ...
  for (i in 1:N_trait) {
    beta_raw[i] ~ normal(0.0, 1.0 / sqrt(N_forms));
  }
  ...
}
```

No warnings about low bulk or tail ESS for any of the parameters. Given how well that seems to work, I think I'll try the K-1 degrees of freedom approach and see whether it's the sum-to-zero constraint that's the problem. (Sum-to-zero is a bit easier to interpret in my application.)

Kent

Further update: Sum-to-zero using the K-1 degrees of freedom approach works just fine. Is there any reason that K-1 would generally work better than the Dirichlet, or is it specific to the data/problem?
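
For reference, the K-1 construction applied to this model looks roughly like the sketch below (the data declarations and prior scale are placeholders, not the exact code I ran):

```stan
data {
  int<lower=2> N_forms;
  int<lower=1> N_trait;
}
parameters {
  vector[N_forms - 1] beta_raw[N_trait];  // K - 1 free parameters per trait
}
transformed parameters {
  vector[N_trait] mu_form[N_forms];

  for (i in 1:N_trait) {
    for (j in 1:(N_forms - 1)) {
      mu_form[j][i] = beta_raw[i][j];
    }
    // last element is minus the sum of the rest, so each trait's effects sum to zero
    mu_form[N_forms][i] = -sum(beta_raw[i]);
  }
}
model {
  for (i in 1:N_trait) {
    beta_raw[i] ~ normal(0.0, 1.0);  // placeholder prior scale
  }
}
```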

Kent

I'm not sure, but that sounds right. That would at least explain why there are so many versions of this in the manual.
