I rewrote the Rats model (the first example model in Volume I of the BUGS examples) to get a feel for what doing it the right way looks like. I’m trying to get it to pass simulation-based calibration (SBC) as written (SBC is the revision of the Cook-Gelman-Rubin diagnostics):
There are three problems I see.

The model itself is so ugly as to be unreadable with all the variable scaling. The multiplier/offset thing’s going to save a lot of headaches.

It takes hours to run Stan with tight enough settings and enough iterations that it passes SBC as written. This was with 1000 saved iterations at a thinning rate of 5 and 256 simulations. I set up chi-squared tests for uniformity.
I could use some guidance on iterations, thinning, and tests for uniformity here. The Talts et al. arXiv paper just has you eyeball histograms! I need something that can be automated, because this is going to have to be put on a cluster if we want to run them all in some kind of development cycle.
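One automatable option (a sketch of my own, not tied to the spgs package; the binning scheme and scipy usage are my assumptions) is a chi-squared test on binned ranks, which avoids eyeballing histograms entirely:

```python
# Sketch: chi-squared uniformity test for SBC ranks.
# Each rank lies in {0, ..., n_draws}; under a calibrated model the ranks
# are uniform, so binned counts should match the uniform expectation.
import numpy as np
from scipy.stats import chisquare

def sbc_rank_pvalue(ranks, n_draws, n_bins=20):
    """Chi-squared p-value for uniformity of SBC ranks in {0, ..., n_draws}.

    For exactly equal expected counts, pick n_bins dividing n_draws + 1;
    otherwise the test is only approximate (negligibly so for large n_draws).
    """
    ranks = np.asarray(ranks)
    counts, _ = np.histogram(ranks, bins=n_bins, range=(-0.5, n_draws + 0.5))
    return chisquare(counts).pvalue  # equal expected counts by default

rng = np.random.default_rng(0)
uniform_ranks = rng.integers(0, 1001, size=256)  # a well-calibrated case
print(sbc_rank_pvalue(uniform_ranks, 1000))
```

Run against a grossly miscalibrated rank set (e.g. all ranks zero) this returns an essentially zero p-value, so it can be wired into an automated pass/fail check on a cluster.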
My R code’s terrible, but all the time’s spent in Stan fitting. This relatively simple hierarchical model takes a minute or two to fit on my notebook when I crank adapt_delta up to 0.99. Part of the problem is probably the super-wide priors. I tightened them considerably from BUGS, but didn’t want to go below normal(0, 2 * theta) for a weakly informative prior on theta. The signs aren’t controlled on any of the variables.

It’s going to wind up taking at least a couple of hours per model to revise, find the right settings, and validate. There are hundreds of example models, so this is something like a 500-hour project to revise them all. That’s a quarter of a full-time year doing nothing else. So nobody should be holding their breath for this.
It may be better to start with zero and add models that work rather than trying to deal with an unvalidated, sprawling set of 200 examples that we don’t even know whether Stan fits properly with NUTS.
Here’s the “good” Rats program:
data {
  int<lower=0> N;
  int<lower=0> T;
  vector[T] x;
  matrix[N, T] y;
}
transformed data {
  real xbar = mean(x);
  vector[T] x_ctr = x - xbar;
}
parameters {
  vector[N] alpha_std;
  real mu_alpha_std;
  real<lower = 0> sigma_alpha_std;
  vector[N] beta_std;
  real mu_beta_std;  // beta.c in original
  real<lower = 0> sigma_beta;
  real<lower = 0> sigma_y_std;
}
transformed parameters {
  real mu_alpha = mu_alpha_std * 500;
  real sigma_alpha = sigma_alpha_std * 50;
  real mu_beta = mu_beta_std * 12;
  real sigma_y = sigma_y_std * 12;
  vector[N] alpha = mu_alpha + sigma_alpha * alpha_std;
  vector[N] beta = mu_beta + sigma_beta * beta_std;
}
model {
  mu_alpha_std ~ normal(0, 1);
  sigma_alpha_std ~ normal(0, 1);
  alpha_std ~ normal(0, 1);
  mu_beta_std ~ normal(0, 1);
  sigma_beta ~ normal(0, 1);
  beta_std ~ normal(0, 1);
  sigma_y_std ~ normal(0, 1);
  for (t in 1:T)
    y[ , t] ~ normal(alpha + beta * x_ctr[t], sigma_y);
}
generated quantities {
  real alpha0 = mu_alpha - xbar * mu_beta;
}
Now here’s my SBC code (spoiler alert: not good). I don’t know how to automate the variable names to check here. I just pulled out the five they talked about in the BUGS writeup and monitored them by hand.
I just copied the data over; it should be sourced into scope, or better yet, put in an encapsulated scope rather than polluting the top level.
library(rstan)

N <- 30
T <- 5
x <- c(8, 15, 22, 29, 36)
xbar <- 22
x_ctr <- x - xbar

rats_model <- stan_model("rats.stan")

M <- 256
K <- 5
ranks <- matrix(NA, K, M)
names <- c("mu_alpha", "sigma_alpha", "mu_beta", "sigma_beta", "sigma_y")
for (m in 1:M) {
  print(sprintf("m = %d", m), quote = FALSE)
  mu_alpha <- rnorm(1, 0, 500)
  sigma_alpha <- abs(rnorm(1, 0, 50))
  alpha <- rnorm(N, mu_alpha, sigma_alpha)
  mu_beta <- rnorm(1, 0, 12)
  sigma_beta <- abs(rnorm(1, 0, 1))
  beta <- rnorm(N, mu_beta, sigma_beta)
  sigma_y <- abs(rnorm(1, 0, 12))
  y <- matrix(NA, N, T)
  for (t in 1:T)
    y[1:N, t] <- rnorm(N, alpha + beta * x_ctr[t], sigma_y)
  data_sim <- list(N = N, T = T, x = x, y = y)
  fit <- sampling(rats_model, data = data_sim,
                  chains = 1, warmup = 2000, iter = 7000,
                  thin = 5,
                  control = list(adapt_delta = 0.99),
                  seed = 1234, refresh = 0)
  fit_ss <- extract(fit)
  ranks[1, m] <- rank(c(mu_alpha, fit_ss$mu_alpha))[1]
  ranks[2, m] <- rank(c(sigma_alpha, fit_ss$sigma_alpha))[1]
  ranks[3, m] <- rank(c(mu_beta, fit_ss$mu_beta))[1]
  ranks[4, m] <- rank(c(sigma_beta, fit_ss$sigma_beta))[1]
  ranks[5, m] <- rank(c(sigma_y, fit_ss$sigma_y))[1]
}

library(spgs)
for (k in 1:K) {
  p_value <- chisq.unif.test(ranks[k, 1:M] / 1001)$p.value
  print(sprintf("%s: p value = %10.8f",
                names[k], p_value),
        quote = FALSE)
}
Also, I used the spgs library, which was the first hit on Google for a chi-squared test in R that worked on data.
Here’s the output:
[1] mu_alpha: p value = 0.83118519
[1] sigma_alpha: p value = 0.28289316
[1] mu_beta: p value = 0.06181241
[1] sigma_beta: p value = 0.93033231
[1] sigma_y: p value = 0.87538122
So it all looks OK.
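On automating the variable names: one route (a hypothetical sketch of my own; the dict-of-draws layout is an assumption, though it’s roughly what extract() hands back) is to compute the rank statistic for every parameter name in one pass, using the count-of-smaller-draws form of the rank, which matches the rank(c(theta, draws))[1] trick up to ties and a base of 0 instead of 1:

```python
# Sketch: SBC rank for every scalar parameter by name, assuming the
# posterior draws come as a dict of name -> 1-D array of draws and the
# simulated "true" values come in a matching dict.
import numpy as np

def sbc_ranks(true_values, draws):
    """Rank of each true value among its draws: count of draws below it."""
    return {name: int(np.sum(draws[name] < true_values[name]))
            for name in true_values}

true_values = {"mu_alpha": 1.2, "sigma_y": 0.5}
draws = {"mu_alpha": np.array([0.9, 1.5, 1.1]),
         "sigma_y": np.array([0.4, 0.6, 0.7])}
print(sbc_ranks(true_values, draws))  # {'mu_alpha': 2, 'sigma_y': 1}
```

With this shape, the list of monitored names is just whatever keys the fit exposes, so nothing has to be pulled out by hand.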
P.S. Here’s the Rats program revised with multiplier/offset scaling. The variable declarations become clunky and hard to read, but the model block is much more readable.
data {
  int<lower=0> N;
  int<lower=0> T;
  vector[T] x;
  matrix[N, T] y;
}
transformed data {
  real xbar = mean(x);
  vector[T] x_ctr = x - xbar;
}
parameters {
  real<multiplier = 500> mu_alpha;
  real<lower = 0><multiplier = 50> sigma_alpha;
  vector<offset = mu_alpha, multiplier = sigma_alpha>[N] alpha;
  real<multiplier = 12> mu_beta;
  real<lower = 0><multiplier = 1> sigma_beta;
  vector<offset = mu_beta, multiplier = sigma_beta>[N] beta;
  real<lower = 0><multiplier = 12> sigma_y;
}
model {
  mu_alpha ~ normal(0, 500);
  sigma_alpha ~ normal(0, 50);
  alpha ~ normal(mu_alpha, sigma_alpha);
  mu_beta ~ normal(0, 12);
  sigma_beta ~ normal(0, 1);
  beta ~ normal(mu_beta, sigma_beta);
  sigma_y ~ normal(0, 12);
  for (t in 1:T)
    y[ , t] ~ normal(alpha + beta * x_ctr[t], sigma_y);
}
generated quantities {
  real alpha0 = mu_alpha - xbar * mu_beta;
}