After upgrading to CmdStan 2.3 samples from prior model contain lots of NA

I just upgraded to CmdStan 2.3 and when I try to sample the prior model, I get lots of NA values when I inspect “pred” (generated quantities). This did not happen when I run the same model with 2.29.

I’m not sure if this is a bug or if the new version is just finding a problem with the model that 2.29 was unable to detect.

I’m attaching the model and the R workspace with the data

Model:

data {
  //Dimensions
  int<lower=1> N;
  int<lower=1> N_site;
  int<lower=1> N_plot;
  int<lower=1> N_cover;

  //Response variable
  array[N] int<lower=0> germ;
  
  //Explanatory variables
 
  vector[N] SM;
  vector[N] ST;
  
  array[N] int<lower=1,upper=N_cover> cover_id;

  //group variables
  array[N] int<lower=1> site_id;
  array[N] int<lower=1> plot_id;

}

parameters {
//intercept
real a;

//varying intercepts
vector[N_site] a_site_nc;
real<lower=0> sigma_site;

vector[N_plot] a_plot;

vector[N_cover]  b_sm_nc;
real mu_sm;
real<lower=0>sigma_sm;

vector[N_cover]  b_st_nc;
real mu_st;
real<lower=0>sigma_st;

//BB    
real<lower=0> phi;
}


transformed parameters{
real theta;

vector[N_site] a_site = 0 + sigma_site*a_site_nc;
vector[N_cover] b_st = mu_st+sigma_st*b_st_nc;
vector[N_cover] b_sm = mu_sm+sigma_sm*b_sm_nc;

vector[N] p = inv_logit(a + a_site[site_id]+a_plot[plot_id]+
b_sm[cover_id].*SM+b_st[cover_id].*ST);

theta = phi + 2;
}

model {
//Likelihood

//germ~beta_binomial(6, p * theta, (1-p) * theta);

b_sm_nc~normal(0,1);
mu_sm~normal(0,0.1);
sigma_sm~exponential(3);

b_st_nc~normal(0,1);
mu_st~normal(0,0.1);
sigma_st~exponential(3);


//priors for intercepts
a~normal(0,0.1);
a_site_nc~normal(0,1);
sigma_site~exponential(3);
a_plot~normal(0,0.1);

//bb
phi ~ exponential(1);


}
generated quantities {
vector[N] pred;
for (i in 1:N) {
pred[i] = beta_binomial_rng(6, p[i]*theta, (1-p[i])*theta);
}
}

Runn the code

library(cmdstanr)

mod<-cmdstan_model("prior_model.stan")

fit_prior<-mod$sample(data=data_list,
                      chains=4,
                      parallel_chains = 4)
prior_samples<-fit_prior$draws(c("pred"),format="matrix")

workspace.RData (3.6 KB)
prior_model.stan (1.5 KB)

In 2.29, are the values just 0? We made a change where if there is an error during the generated quantities block the results should be NA, when previously they were zero

Does fit_prior$output(1) show anything out of the ordinary?

I see several Exceptions like like these ones:

Iteration: 1001 / 2000 [ 50%]  (Sampling)
Exception: beta_binomial_rng: Second prior sample size parameter is 0, but must be positive finite! (in '/tmp/RtmpPVoOto/model-5bea24f143bf.stan', line 90, column 0 to column 59)
Exception: beta_binomial_rng: Second prior sample size parameter is 0, but must be positive finite! (in '/tmp/RtmpPVoOto/model-5bea24f143bf.stan', line 90, column 0 to column 59)
Exception: beta_binomial_rng: Second prior sample size parameter is 0, but must be positive finite! (in '/tmp/RtmpPVoOto/model-5bea24f143bf.stan', line 90, column 0 to column 59)
Exception: beta_binomial_rng: Second prior sample size parameter is 0, but must be positive finite! (in '/tmp/RtmpPVoOto/model-5bea24f143bf.stan', line 90, column 0 to column 59)

Version 2.29. also returned these Exceptions, but I ignored them because:

fit_prior$diagnostic_summary()
$num_divergent
[1] 0 0 0 0

$num_max_treedepth
[1] 0 0 0 0

$ebfmi
[1] 1.0914897 1.0093226 0.9314585 1.0124989

When I try to sample from the posterior (I’m attaching the model to this reply) I also see several Exceptions, but no NA values.

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: beta_binomial_lpmf: Second prior sample size parameter[29] is 0, but must be positive finite! (in '/tmp/RtmpAEt0Ss/model-24051fefd2d2.stan', line 65, column 0 to column 48)
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Iteration:  100 / 2000 [  5%]  (Warmup)
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: beta_binomial_lpmf: First prior sample size parameter[1] is 0, but must be positive finite! (in '/tmp/RtmpAEt0Ss/model-24051fefd2d2.stan', line 65, column 0 to column 48)
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  300 / 2000 [ 15%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)

What should I do about he NA values? Ignore them? Re parameterize the model? If so, what should I change?

model.stan (1.5 KB)

I think how you treat them is dependent on your use case. When sampling, you don’t see NA values because the proposal is rejected. In generated quantities, the proposal has already been accepted, so generate quantities doesn’t have the ability to rewind time and reject it, hence outputting NA.

I think the warning from sampling is probably good advice here, too. If they’re sporadic, you can probably just ignore them, but if you get many of them it is most likely indicative of a problem you should solve at the modeling level through reparameterization, additional constraints (if they make logical sense), etc

1 Like