 # Simple intercept only model with poor chain mixing

Hi everyone!

I have been working with @lizzieinvancouver to understand why simple intercept-only models with two grouping factors would result in poorly mixing chains and low neffective values.

This issue came to our attention when working on a more complex model, but in running a similar model on a much smaller dataset we found the same thing.

The data we used for this example was of scores from the 1932 Olympic Games in Lake Placid. OlympicGames_1932.csv (1.8 KB)

The stan code for our intercept-only model is the following:

``````data {
int<lower=0> N;//number of observation
int<lower=0> n_pair;//number of pairs
int<lower=0> n_judge;//number of judges
int<lower=1,upper = n_pair> pair[N];//vector of pairs
int<lower=1,upper = n_judge> judge[N];//vector of judges
vector[N] y;
}

parameters {
vector[n_pair] gamma;
vector[n_judge] delta;
real<lower=0> mu;
real<lower=0> sigma_gamma;
real<lower=0> sigma_delta;
real<lower=0> sigma_y;
}

transformed parameters {
vector[N] y_hat;

for (i in 1:N)
y_hat[i] = mu + gamma[pair[i]] + delta[judge[i]];
}

model {
mu ~ normal(1,10);
sigma_gamma ~ normal(1,10);
sigma_delta ~ normal(1,10);
sigma_y ~ normal(1,1);

gamma ~ normal(0, sigma_gamma);
delta ~ normal(0, sigma_delta);
//likelihood
y ~ normal(y_hat, sigma_y);
}
generated quantities {
real y_rep[N];
for(n in 1:N)
y_rep[n] = normal_rng(y_hat[n],sigma_y);
}
``````

The R code to run the model is:

``````prog <- subset(dat, criterion == "Program")

prog_data <- list(y = prog\$score,
N = dim(prog),
pair = prog\$pair,
judge = prog\$judge,
n_pair = length(unique(prog\$pair)),
n_judge = length(unique(prog\$judge)))

mdl<- stan('stan/OlympicGames_1932mdl.stan',
data = prog_data)
``````

Even with this simple example, we observe poorly miking chains and neffective values as low as 465:

Any thoughts on what processes are underlying this result and how they might be addressed would be much appreciated.

Thanks!

2 Likes

You don’t seem to have a prior on `mu`.

And on a code/compute efficiency note, you can do:

``vector[N] y_hat = mu + gamma[pair] + delta[judge];``

Also, take a look at the bayesplot package for a variety of diagnostic plots you might find useful. I’d be keen to see a pairs plot for example.

Thanks @mike-lawrence for taking a look at the model.

I have edited the above code, adding a prior for mu as you suggested. The chains are still poorly mixed and some n_eff are still low. The pairs plot looks like this:

I also ran the model using rstanarm and had similar results.

``````mdl.arm <- stan_lmer(score ~ 1 + (1 | judge) + (1 |pair), data = prog)
``````

1 Like

i’m away from my computer, so I can’t check the example data you linked; how many judges and pairs are there? If few, I think a fully-non-centered parameterization of the judge and pair parameters might be best, which is happily easily achieved these days through use of the multiplier declaration:

``````parameters {
…
real<lower=0> sigma_gamma;
real<lower=0> sigma_delta;
vector<multiplier=sigma_gamma>[n_pair] gamma ;
vector<multiplier=sigma_delta>[n_judge] delta ;
}
``````

Note that your first model would be considered partially non-centered because it is non-centered with regards to the intercept mu but not the variance parameters. Which actually highlights that the term non-centered is not a great one 😛

Also, I find your priors a little odd; specifically you seem to be centring them on a mean of one (that’s what `normal(1,…)` does). Is this what you intend?

1 Like