Some chains have no samples in RStan

Hi,

I have been testing my Stan program on simulated (fake) data, and sometimes I get the following error:

error_test3 <- stan(file = "testing_3.stan", data = data_test3, refresh = 0, cores = 5, iter = 500, chains = 6)

These are the messages that were returned:

[[1]]
Stan model 'testing_3' does not contain samples.

[[2]]
Stan model 'testing_3' does not contain samples.

[[3]]
Stan model 'testing_3' does not contain samples.

Warning messages:
1: In system(paste(CXX, ARGS), ignore.stdout = TRUE, ignore.stderr = TRUE) :
  '-E' not found
2: In .local(object, ...) :
  some chains had errors; consider specifying chains = 1 to debug
3: There were 748 divergent transitions after warmup. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them. 
4: Examine the pairs() plot to diagnose sampling problems
 
5: The largest R-hat is 2.89, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat 
6: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
7: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess 

From my understanding of the above messages, the Stan file itself is ‘correct’, since 3 out of 6 chains actually returned some values. But I am still confused about why three of the chains did not contain any values. Or is it because my Stan file itself is wrong?

Thx!

I think that in rstan, when running multiple chains on multiple cores, the error or warning messages sometimes aren’t fully printed. If you try with cores=1, do you see more information in the error messages? If your model is slow and switching to cores=1 is going to be a pain, you could switch to cores=1 but only run for a few iterations.
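A minimal sketch of such a debugging run, reusing the file and data names from the call above. The small iter value is an arbitrary illustrative choice, refresh is left at its default so progress output is not suppressed, and the guard around the call is only there so the snippet does nothing where rstan, the model file, or the data object is missing.

```r
# Arguments for a short, single-chain debugging run.
# The file name comes from the original call; iter = 50 is an
# arbitrary small value chosen only to keep the run fast.
debug_args <- list(
  file   = "testing_3.stan",
  chains = 1,   # one chain, so there is a single source of messages
  cores  = 1,   # run in the main R process so errors are printed there
  iter   = 50
)

# Only attempt the run when rstan, the model file and the data are present.
if (requireNamespace("rstan", quietly = TRUE) &&
    file.exists(debug_args$file) && exists("data_test3")) {
  debug_args$data <- data_test3
  fit_debug <- do.call(rstan::stan, debug_args)
}
```

Because a failed initialization depends on the random starting point, it may take a few repeated runs (or different seed values) before the failing case shows up.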

With cores = 1, the error message looks like this:

[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
[2] "In addition: Warning message:"
[3] "In system(paste(CXX, ARGS), ignore.stdout = TRUE, ignore.stderr = TRUE) :"
[4] " '-E' not found"
[1] "error occurred during calling the sampler; sampling not done"

The tricky thing is that when I specify cores = 5, I at least get some draws (although some chains are empty), but when I specify cores = 1, the above error message appears and no draws are returned.

That suggests that it had problems starting the chains. Sometimes this is because the model is difficult and you need to manually specify initial values (usually you can just let Stan generate them). But since you’re also getting warnings about divergences and high R-hat values from the chains that did finish, I suspect that there’s something wrong in the Stan program. It seems to be syntactically correct, since at least a few of the chains are running, but the model may have problems that cause poor sampling in some chains and prevent the others from initializing properly.
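As a hedged illustration of specifying initial values manually, the sketch below passes an init function to stan(). The parameter names are taken from the model posted later in this thread; the ranges are illustrative guesses, not recommended values. One thing that may be relevant: by default Stan draws starting values uniformly on (-2, 2) on the unconstrained scale, so a parameter declared without bounds can start at a negative value, whereas a lognormal prior needs a positive value and log(lambda) with log(1 - lambda) needs lambda in (0, 1).

```r
# Hypothetical starting values. Names follow the model posted later in
# this thread; the ranges are illustrative guesses only.
init_fun <- function() {
  list(
    beta_1   = runif(1, 0.5, 2),
    beta_2   = runif(1, 0.5, 2),
    beta_3   = runif(1, 0.5, 2),
    phi_1    = runif(1, 0.5, 2),   # positive, as the lognormal prior requires
    phi_2    = runif(1, 0.5, 2),
    phi_3    = runif(1, 0.5, 2),
    sigma_y  = runif(1, 0.5, 2),
    mu       = rnorm(1),
    sigma    = runif(1, 0.5, 2),
    mu_ex    = rnorm(1),
    sigma_ex = runif(1, 0.5, 2),
    lambda   = runif(1, 0.2, 0.8)  # strictly inside (0, 1)
  )
}

# Only attempt the run when rstan, the model file and the data are present.
if (requireNamespace("rstan", quietly = TRUE) &&
    file.exists("testing_3.stan") && exists("data_test3")) {
  fit_init <- rstan::stan(file = "testing_3.stan", data = data_test3,
                          init = init_fun, chains = 1, cores = 1, iter = 50)
}
```

Starting values only help a chain get going. If the declared parameter bounds do not match the support of the priors, the sampler can still wander into invalid regions afterwards, so this is a diagnostic aid rather than a fix.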

If the model is really complicated then you may need to simplify it until the problems go away and then build it back up trying to identify where things start to go wrong. That’s pretty much always a good strategy if the model is complicated.

If you can share your code and data so we can try running it, we may be able to give much more specific advice.

Hi jonah,

Happy New Year, and sorry for my late reply. I have recently tried to debug it but still get the same error, where some of the chains contain no samples. The program models a mixture of normal distributions. A simplified version of the code is as follows, where g(), h(), f(), k() are known functions of the parameters:

data {
  int<lower=0> N1;
  int<lower=0> N2;
  vector[N1] y_1; 
  vector[N2] y_2; 
  vector[N2] y_3; 
}

parameters {
  real<lower=0> beta_1;
  real<lower=0> beta_2; 
  real<lower=0> beta_3;
  real phi_1;
  real phi_2;
  real phi_3;
  real<lower=0> sigma_y;
  real mu; 
  real<lower=0> sigma; 
  real mu_ex; 
  real<lower=0> sigma_ex; 
  real lambda;
}

model {
  beta_1 ~ lognormal(0,100);
  beta_2 ~ lognormal(0,100);
  beta_3 ~ lognormal(0,100);
  phi_1 ~ lognormal(mu_ex,sigma_ex); 
  phi_2 ~ lognormal(mu,sigma);
  phi_3 ~ lognormal(mu,sigma);
  mu_ex ~ normal(0,100); 
  sigma_ex ~ uniform(0,100);
  mu ~ normal(0,100); 
  sigma ~ uniform(0,100);
  lambda ~ uniform(0,1);
  sigma_y ~ uniform(0,100);
  for (i in 1:N1){
    y_1[i] ~ normal(g(beta_1, beta_2, beta_3, phi_1), sigma_y*h(beta_1, beta_2, beta_3, phi_1));
  }
  for (n in 1 : N2){
    target += log_sum_exp(log(lambda) + normal_lpdf(y_2[n]|f(beta_1, beta_2, beta_3), sigma_y*k(beta_1, beta_2, beta_3)),
                          log(1 - lambda) + normal_lpdf(y_2[n]|g(beta_1, beta_2, beta_3, phi_2), sigma_y*h(beta_1, beta_2, beta_3, phi_2)));
    target += log_sum_exp(log(lambda) + normal_lpdf(y_3[n]|f(beta_1, beta_2, beta_3), sigma_y*k(beta_1, beta_2, beta_3)),
                          log(1 - lambda) + normal_lpdf(y_3[n]|g(beta_1, beta_2, beta_3, phi_3), sigma_y*h(beta_1, beta_2, beta_3, phi_3)));
  }
}

Since y_1 has no mixture components, I just used the ‘~’ sampling statement in Stan, but y_2 and y_3 have mixture components, so I used the ‘target +=’ statement for them. The data are just simulated fake data to test whether the code is runnable, and the returned results show the same kind of error (e.g. 6 out of 20 total chains do not contain samples):

[[1]]
Stan model 'try' does not contain samples.

[[2]]
Stan model 'try' does not contain samples.

[[3]]
Stan model 'try' does not contain samples.

[[4]]
Stan model 'try' does not contain samples.

[[5]]
Stan model 'try' does not contain samples.

[[6]]
Stan model 'try' does not contain samples.

(The true model is more complicated, involving more hyperparameters in the means and standard deviations of the mixture normals, but here I tried to keep it simple to find out what is going wrong. Also, the observed data size is relatively small, e.g. N1 = 8 and N2 = 10.)
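For readers who want to reproduce the setup, here is a sketch of how fake data of this shape might be simulated in R. It is not the data actually used: since g(), h(), f() and k() are not given, fixed constants (mean_g, sd_g, mean_f, sd_f) stand in for them, and the mixing weight lambda_true is likewise made up.

```r
set.seed(123)

N1 <- 8
N2 <- 10

# Hypothetical constants standing in for g(), h(), f(), k() and lambda.
mean_g <- 2;  sd_g <- 1      # component used for y_1 and weight 1 - lambda
mean_f <- 0;  sd_f <- 0.5    # component with weight lambda
lambda_true <- 0.3

# Draw n values from the two-component normal mixture.
sim_mixture <- function(n) {
  from_f <- rbinom(n, 1, lambda_true) == 1
  ifelse(from_f, rnorm(n, mean_f, sd_f), rnorm(n, mean_g, sd_g))
}

# List in the layout expected by the data block of the model above.
data_test <- list(
  N1  = N1,
  N2  = N2,
  y_1 = rnorm(N1, mean_g, sd_g),  # no mixture for y_1
  y_2 = sim_mixture(N2),
  y_3 = sim_mixture(N2)
)
```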

Thanks so much for your help!

I have made some more attempts to debug and found the following:

  1. If I change
for (i in 1:N1){
   y_1[i] ~ normal(g(beta_1, beta_2, beta_3, phi_1), sigma_y*h(beta_1, beta_2, beta_3, phi_1));
 }

to

for (i in 1:N1){
   target += normal_lpdf(y_1[i] | g(beta_1, beta_2, beta_3, phi_1), sigma_y*h(beta_1, beta_2, beta_3, phi_1));
 }

the results seem to improve (but there are still many chains that do not contain samples).

  2. When I deleted the mixture specification and used only one normal density for y_2 and y_3, the issue also improved.

Not sure whether these two findings will be helpful.