Hi all,
I’ve been working on a multivariate state-space model in RStan and have been running into a strange problem when trying to generate predictions from the model in the generated quantities section. On more than half of the random seeds I’ve tried, when I run the model with generated predictions there has been an error on at least one of the chains that causes that chain(s) to stop sampling entirely. The warning message below appears 5 times:
[1] "Chain 4: Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"
[2] "Chain 4: Exception: ordered_logistic: Cut-points is not a valid ordered vector. The element at 4 is inf, but should be greater than the previous element, inf (in 'model371c95768e5_stan_help_code' at line 100)"
[3] "Chain 4: If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"
[4] "Chain 4: but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."
And then an error message saying that the chain does not contain samples pops up. Note that “line 100” above refers to the ordered logistic model itself, not to the generated quantities section. The chains without the error have no issue sampling and the posterior predictive checks look reasonable. Up until now, the model has been converging fine – prior to adding in a generated quantities section I’ve run slight variations of it >100 times and have never seen an error message like this. Additionally, rerunning the model with the same seed but without the generated quantities section does not cause the error and it always converges.
In the only cases of this error message I’ve been able find online, the culprit was the lack of prior on the cutpoints and the solution was to use betanalpha’s induced Dirichlet prior (Ordinal Regression). However, I’ve already been using this prior and it appeared to be working quite well prior to adding in the generated quantities section.
The actual model itself is quite bulky, but I’ve been able to reproduce the same problem on a pared down version of it:
functions {
//prior for cutpoints from betanalpha's blog
//https://betanalpha.github.io/assets/case_studies/ordinal_regression.html
real induced_dirichlet_lpdf(vector c, vector alpha, real phi) {
int K = num_elements(c) + 1;
vector[K - 1] sigma = inv_logit(phi - c);
vector[K] p;
matrix[K, K] J = rep_matrix(0, K, K);
// Induced ordinal probabilities
p[1] = 1 - sigma[1];
for (k in 2:(K - 1))
p[k] = sigma[k - 1] - sigma[k];
p[K] = sigma[K - 1];
// Baseline column of Jacobian
for (k in 1:K) J[k, 1] = 1;
// Diagonal entries of Jacobian
for (k in 2:K) {
real rho = sigma[k - 1] * (1 - sigma[k - 1]);
J[k, k] = - rho;
J[k - 1, k] = rho;
}
return dirichlet_lpdf(p | alpha)
+ log_determinant(J);
}
}
data{
int TT; // total timespan across all datasets
//first dataset
int<lower=1> N1;//number of observations
int y1[N1]; //abundance category for each survey (1-5)
int K1; //ordinal levels
int<lower=0> N_yr1; //number of years
int yr_index[N_yr1]; //index of years
int<lower=1,upper=N_yr1> year_id1[N1]; // vector of year
//second dataset
int<lower=1> N2;
int y2[N2];
int<lower=0> N_yr2;
int<lower=1,upper=N_yr1> year_id2[N2];
int tt_convert2[N_yr2]; //conversion vector for missing years
}
parameters {
//observational model parameters
ordered[K1-1] c1;
real<lower=0> phi2;
//state space component
real<lower = 0> sd_r1;
real<lower = 0> sd_r2;
real<lower = 0> sd_q;
vector[TT] pro_dev;
vector[N_yr1] obs_dev1;
vector[N_yr2] obs_dev2;
real x0;
real a2;
}
transformed parameters{
vector[TT] x; //underlying state
vector[N_yr1] a_yr1; //observed annual estimates dataset 1
vector[N_yr2] a_yr2; //observed annual estimates dataset 2
x[1] = x0 + pro_dev[1]*sd_q;
for(t in 2:TT){
x[t] = x[t-1] + pro_dev[t]*sd_q;
}
for(i in 1:N_yr1){
a_yr1[i] = x[yr_index[i]] + obs_dev1[i]*sd_r1;
}
for(i in 1:N_yr2){
a_yr2[i] = a2 + x[tt_convert2[i]] + obs_dev2[i]*sd_r2; //includes scalar, a2
}
}
model{
//priors for observation models
c1 ~ induced_dirichlet(rep_vector(1, K1), 0);
phi2 ~ inv_gamma(5,5);
//priors for state model
sd_q ~ gamma(2,4);
sd_r1 ~ gamma(2,4);
sd_r2 ~ gamma(2,4);
x0 ~ normal(0,5);
a2 ~ normal(0,5);
pro_dev~ std_normal();
obs_dev1 ~ std_normal();
//observation models
y1 ~ ordered_logistic(a_yr1[year_id1], c1);
y2 ~ neg_binomial_2_log(a_yr2[year_id2], phi2);
}
generated quantities{
vector[N1] y_predict1;
vector[N2] y_predict2;
for (i in 1:N1) y_predict1[i] = ordered_logistic_rng(a_yr1[year_id1[i]],c1);
for (j in 1:N2) y_predict2[j] = neg_binomial_2_log_rng(a_yr2[year_id2[j]], phi2);
}
Strangely, it’s not the ordered_logistic_rng() causing the problems, it seems to be the neg_binomial_2_log_rng(). If I run the model with the same seed and excluding y_predict2, I have no issues. If I instead run the line for y2 above the line for y1 in the model block, the same chain(s) cause problems with an additional warning about the negative binomial model as well:
[1] "Chain 4: Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"
[2] "Chain 4: Exception: neg_binomial_2_log_lpmf: Precision parameter is 0, but must be > 0! (in 'model2fcc17ce6df2_stan_help_code' at line 100)"
[3] "Chain 4: If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"
[4] "Chain 4: but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."
And again, the problem goes away if I remove y_predict2 with the neg_binomial_2_log_rng(). Reparameterizing with neg_binomial_2 resulted in the same issue. Unfortunately, I haven’t yet been able to reproduce the error with simulated data, but I am struggling to understand how this issue would arise only when including the negative binomial random number generator, even if the model was otherwise misspecified. Any insights would be much appreciated!