Hi,
I’ve been trying to fix initialisation errors in a wiener
family model, where there are multiple constraints (such as the non-decision time being lower than the smallest observed outcome).
However, I noticed that my initialisation values were seemingly not passed to Stan. I’ve identified this as resulting from a silent error when passing initial values for real
Stan parameters as lists, rather than single values. The passed values are simply ignored. Making this mistake the other way around (specifying single inits for vector
parameters, even if of length 1) results in a dimension error.
To try and replicate this I set up a toy problem:
library(brms)
b <- c(2, 0.75)
x <- rnorm(200)
group <- as.integer(rbernoulli(200))
y <- rnorm(200, mean = 1 + group + x)
dat1 <- data.frame(x, y, group)
formula <- bf(y ~ 1 + group, sigma ~ group, family=brmsfamily("gaussian", link_sigma="identity"))
write(make_stancode(formula, data = dat1), file = "test-stancode.stan")
standat <- make_standata(formula, dat1)
This produces the following Stan code:
// generated with brms 2.13.5
functions {
}
data {
int<lower=1> N; // number of observations
vector[N] Y; // response variable
int<lower=1> K; // number of population-level effects
matrix[N, K] X; // population-level design matrix
int<lower=1> K_sigma; // number of population-level effects
matrix[N, K_sigma] X_sigma; // population-level design matrix
int prior_only; // should the likelihood be ignored?
}
transformed data {
int Kc = K - 1;
matrix[N, Kc] Xc; // centered version of X without an intercept
vector[Kc] means_X; // column means of X before centering
int Kc_sigma = K_sigma - 1;
matrix[N, Kc_sigma] Xc_sigma; // centered version of X_sigma without an intercept
vector[Kc_sigma] means_X_sigma; // column means of X_sigma before centering
for (i in 2:K) {
means_X[i - 1] = mean(X[, i]);
Xc[, i - 1] = X[, i] - means_X[i - 1];
}
for (i in 2:K_sigma) {
means_X_sigma[i - 1] = mean(X_sigma[, i]);
Xc_sigma[, i - 1] = X_sigma[, i] - means_X_sigma[i - 1];
}
}
parameters {
vector[Kc] b; // population-level effects
real Intercept; // temporary intercept for centered predictors
vector[Kc_sigma] b_sigma; // population-level effects
real Intercept_sigma; // temporary intercept for centered predictors
}
transformed parameters {
}
model {
// initialize linear predictor term
vector[N] mu = Intercept + Xc * b;
// initialize linear predictor term
vector[N] sigma = Intercept_sigma + Xc_sigma * b_sigma;
// priors including all constants
target += student_t_lpdf(Intercept | 3, 1.4, 2.5);
target += student_t_lpdf(Intercept_sigma | 3, 0, 2.5);
// likelihood including all constants
if (!prior_only) {
target += normal_lpdf(Y | mu, sigma);
}
}
generated quantities {
// actual population-level intercept
real b_Intercept = Intercept - dot_product(means_X, b);
// actual population-level intercept
real b_sigma_Intercept = Intercept_sigma - dot_product(means_X_sigma, b_sigma);
}
If I now try to run this while providing initial values that I know will violate constraints, I noticed that initial values for real
parameters are ignored if they are lists:
init_fun <- function() {
list(
Intercept = 0,
b = list(0),
Intercept_sigma = list(-20),
b_sigma = list(0)
)
}
rstan::stan("test-stancode.stan", init = init_fun, data = standat)
Sampling will proceed as normal, even though I would have expected it to either:
- throw an error stating that
real
parameters can not be passed lists as initial values - somehow use the value provided, which would lead sampling to fail
This is in contrast to when supplying values that should be a list but are passed as single values instead:
init_fun <- function() {
list(
Intercept = 0,
b = 0,
Intercept_sigma = list(-20),
b_sigma = list(0)
)
}
rstan::stan("test-stancode.stan", init = init_fun, data = standat)
Which produces the output:
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=b; dims declared=(1); dims found=()
[1] "Error in sampler$call_sampler(args_list[[i]]) : "
[2] " mismatch in number dimensions declared and found in context; processing stage=parameter initialization; variable name=b; dims declared=(1); dims found=()"
It would be great if a similar error could be thrown for the first example I posted, as silently ignoring mismatching value dimensions makes it very hard to debug a program.
PS: For reference, I initially incorrectly raised this as an issue at the brms repository: https://github.com/paul-buerkner/brms/issues/1015