I’m just learning Stan, so I set up a simple model: a regression with two predictors, with everything simulated from standard normals. The prior on the coefficients is normal with a specified mean and covariance, and the likelihood is also normal.
After 4,000 iterations with 1,000 warm-up iterations, I get warnings that R-hat is too high (1.84) and that the effective sample sizes are too low. Why is that? This seems like a trivial model with very well-behaved data.
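# Random X
X <- matrix(rnorm(6000 * 2), ncol = 2)
# Prior covariance for the coefficients: X'X, then its lower-triangular Cholesky factor
cov_betas_L <- t(X) %*% X
cov_betas_L <- t(chol(cov_betas_L))
# Sum the columns of X, add noise
y <- X %*% rep(1, ncol(X)) + rnorm(nrow(X))
# Drop the matrix dimension so y is a plain numeric vector
y <- y[, 1]
# Prior means for the coefficients
prior_betas <- c(-2, 2)
stan_data <- list(N = nrow(X), M = ncol(X), X = X, y = y,
                  prior_betas = prior_betas, prior_L = cov_betas_L)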
data {
  int<lower = 1> N;
  int<lower = 1> M;
  vector[N] y;
  matrix[N, M] X;
  cholesky_factor_cov[M, M] prior_L;
  vector[M] prior_betas;
}
parameters {
  real alpha;
  vector[M] beta_z;
  real<lower = 0> y_sigma;
}
model {
  vector[M] beta = prior_betas + prior_L * beta_z;
  beta_z ~ normal(0, 1);
  y ~ normal(X * beta, y_sigma);
}
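For reference, here is my reading of what the model block above specifies, written out as equations. This is a sketch rather than anything Stan reports; the symbols $\mu_0$ (for `prior_betas`), $L$ (for `prior_L`), $z$ (for `beta_z`), and $\sigma$ (for `y_sigma`) are my own shorthand.

$$
\begin{aligned}
z &\sim \mathcal{N}(0, I_M), \\
\beta &= \mu_0 + L z \quad\Longrightarrow\quad \beta \sim \mathcal{N}(\mu_0,\, L L^\top), \\
y \mid \beta, \sigma &\sim \mathcal{N}(X\beta,\, \sigma^2 I_N).
\end{aligned}
$$

With the R code above, $L$ is the lower-triangular Cholesky factor of $X^\top X$, so the implied prior covariance of $\beta$ is $L L^\top = X^\top X$.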
The warnings are:
Warning messages:
1: There were 4352 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
2: Examine the pairs() plot to diagnose sampling problems
3: The largest R-hat is 1.84, indicating chains have not mixed.
Running the chains for more iterations may help. See
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See