Errors during large-scale testing of Stan

Hi All,

I’ve been conducting large-scale testing of Stan using the models in the example-models repository (stan-dev/example-models on GitHub). The goal is to build a broad and diverse benchmark for the SMC-based algorithms we’re developing.

We expected that there may be issues running the new SMC algorithm on existing Stan models, and indeed we have seen some errors which are unique to these changes. However, we were surprised to find that NUTS was running into a lot of errors as well.

I’ve been looking at these errors in detail today, and the vast majority seem to come from incorrectly formatted dump data input files (e.g. some contain native R code instead). However, there are also 9 models with initialization errors:

  • example-models/misc/multivariate-probit/probit-multi-good
  • example-models/misc/dlm/fx_factor
  • example-models/knitr/pool-binary-trials/pool
  • example-models/knitr/pool-binary-trials/no-pool
  • example-models/knitr/pool-binary-trials/hier
  • example-models/knitr/pool-binary-trials/hier-logit-centered
  • example-models/knitr/pool-binary-trials/hier-logit
  • example-models/knitr/chapter2/golf1
  • example-models/knitr/golf/golf1

These can be subdivided into four apparently independent errors.

probit-multi-good:

Phi: x is nan, but must not be nan!

fx_factor:

gaussian_dlm_obs_lpdf: G[1] is nan, but must be finite!

pool, no-pool, hier, hier-logit-centered, hier-logit:

binomial_logit_lpmf: Successes variable[1] is 4, but must be in the interval [0, 3]

golf1:

binomial_lpmf: Probability parameter[1] is <negative_value>, but must be in the interval [0, 1]

The binomial_logit_lpmf error may be down to the fact that the knitr models use “garbage generated data” that doesn’t make sense in context (https://github.com/stan-dev/example-models/commit/9917f716b90cc072ae75bb2fd9dc020cc6dd5d4c). However, I’ve played around with changing the init parameter and the input data for a couple of the other models and I’m still getting the same errors.
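The binomial_logit_lpmf message above says that the first success count (4) is larger than its trial count (3), which is a data problem rather than an initialization problem. A minimal sketch of a check that would surface this while the data is loaded, before any sampling starts, is shown here. The names N, K and y are assumptions for illustration and may not match the declarations in the knitr models, and the array syntax assumes a recent Stan release (older releases write int K[N] instead).

```stan
data {
  int<lower=0> N;            // number of units (hypothetical name)
  array[N] int<lower=0> K;   // trials per unit (hypothetical name)
  array[N] int<lower=0> y;   // successes per unit (hypothetical name)
}
transformed data {
  // Fail early with a readable message if any success count
  // exceeds its trial count.
  for (n in 1:N) {
    if (y[n] > K[n]) {
      reject("y[", n, "] = ", y[n], " exceeds K[", n, "] = ", K[n]);
    }
  }
}
```

With a check like this, a bad data file is reported once at load time instead of showing up as repeated rejections of the initial values.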

Does anyone have an idea why Stan is having initialization problems with these models? Interested to hear everyone’s thoughts!

Phil


Often one needs to set the init_r argument to something smaller than its default value of 2 to get things to initialize, but those hier-logit and golf1 models are just wrong if they throw those errors.

I’ve just tried changing the init argument to 1 for probit-multi-good and that seems to do the trick. However, that leaves fx_factor which still throws the “gaussian_dlm_obs_lpdf: G[1] is nan, but must be finite!” error for various init values. Here’s the model:

// Kalman Filter (Multivariate Form)
// - no missing values
data {
  // system matrices
  int r;
  int T;
  matrix[r, T] y;
  vector[1] m0;
  cov_matrix[1] C0;
}
transformed data {
  matrix[1, 1] G;
}
parameters {
  vector[r - 1] lambda;
  vector<lower=0.0>[r] V;
  cov_matrix[1] W;
}
transformed parameters {
  matrix[1, r] F;
  F[1, 1] <- 1;
  for (i in 1:(r - 1)) {
    F[1, i + 1] <- lambda[i];
  }
}
model {
  y ~ gaussian_dlm_obs(F, G, V, W, m0, C0);
}

Seems pretty obvious what is going on now that I look at it. G is declared in the transformed data block but never assigned a value, so it stays NaN and no choice of initial values can fix it.
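One possible fix is to assign G inside the transformed data block. The sketch below assumes the intended transition matrix is the 1x1 identity, i.e. that the single latent factor follows a random walk. That value is a guess on my part and not something confirmed by the repository, so it should be checked against the model’s intent.

```stan
transformed data {
  // Assumption: identity state transition (random-walk latent factor).
  // Replace 1 with the intended value if the model calls for something else.
  matrix[1, 1] G;
  G[1, 1] = 1;
}
```

Older Stan releases would write the assignment with <- as in the rest of this model. Either way, once G holds a finite value the “G[1] is nan, but must be finite!” check in gaussian_dlm_obs_lpdf should no longer fire.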
