Problem "running the model cmdstanr with simulated data in R"

Dear Stan community,

I want to check my model with simulated data in R. To this end, first I generated the data in R and then run my model in the cmdstanr with 50 iterations (init=0, n=310,iterations=50,iter_warmup=4000,iter_sampling=4000, adapt_delta=0.99,max_treedepth=15,chains=1). But, I got the following error during the execution (after iterations=30). It should be noted that the model runs without any problems with a low number of iterations, for example, 5 iterations.
Can @rok_cesnovar and anyone else help me to solve this issue? I will appreciate any recommendations.

"
Model executable is up to date!
Running MCMC with 1 chain…
Chain 1 Rejecting initial value:
Chain 1 Log probability evaluates to log(0), i.e. negative infinity.
Chain 1 Stan can’t start sampling from this initial value.
Chain 1 Initialization failed.
Warning: Chain 1 finished unexpectedly!
Error: No chains finished successfully. Unable to retrieve the draws.
In addition: Warning message:
Error: No chains finished successfully. Unable to retrieve the draws.
".

My stan codes are here:

functions{
  // defines the log spdf
  vector a (vector p,matrix kernel, vector exp_x){
    vector[num_elements(p)] a_; // `a` saved to output
   //{ // `z` not part of output
    vector[num_elements(p)] landa;
    for (k in 1 : num_elements(p)) {
        landa[k] = kernel[k,] * exp_x * p[k];
         }
   a_ = landa / sum(landa); // assign `a`
     for (i in 1:num_elements(p)) {
  if (is_nan(a_[i]) || (a_[i] < 0.0)) {
    reject("a_[",i,"] is undefined");
           }
           }
  return a_;
            }
// defines the log survival
  vector log_S(vector t, vector p, matrix kernel, vector exp_x,int[] county, real shape, vector scale) {
vector[num_elements(t)] log_S_;
vector[num_elements(p)] spdf;
spdf = a (p,kernel, exp_x);
for (i in 1:num_elements(t)){
//log_S[i] = log(spdf[county[i]]) + log(1- ((scale [i]*(t[i])^shape)/ (1+(scale [i]*(t[i])^shape)))); // county is a vector that shows the location of patients with dim = num_elements(t) 
 log_S_[i] = log(spdf[county[i]]) - log1p(scale[i] * (t[i] ^ shape)); // county is a vector that shows the location of patients with dim = num_elements(t) 
}
return log_S_;
}

//defines the log pdf
vector log_f (vector t,vector p, matrix kernel, vector exp_x, int[] county, real shape,vector scale){
vector[num_elements(t)] log_f_;
vector[num_elements(p)] spdf;
spdf = a (p,kernel, exp_x);
for (i in 1:num_elements(t)){
//log_f[i] = log(spdf[county[i]]) + log((scale[i]*shape*t[i]^(shape-1))/((1+(scale[i]*t[i]^shape))^2));
log_f_[i] = log(spdf[county[i]]) + log(scale[i]*shape*t[i]^(shape-1)) - 2* log1p((scale[i]*(t[i])^shape));
}
return log_f_;
}
//defines the log-likelihood
real surv_loglogistic_lpdf(vector t,vector d,vector p, matrix kernel, vector exp_x, int[] county, real shape,vector scale){
vector[num_elements(t)] log_lik;
real prob;
vector[num_elements(t)] ls_s;
vector[num_elements(t)] ls_f;
 for (i in 1 : num_elements(t)) {
    if (t[i] <= 0.0) {
      reject("t[", i, "] is ", t[i]);
    }
  }
ls_s = log_S( t, p,  kernel,  exp_x, county,  shape, scale);
ls_f = log_f( t, p,  kernel,  exp_x, county,  shape, scale);
log_lik = d .* log_f(t, p,  kernel,  exp_x, county,  shape, scale)+ (1 - d) .* log_S(t, p,  kernel,  exp_x, county,  shape, scale);
for (i in 1 : num_elements(log_lik)) {
    if (is_inf(log_lik[i]) || is_nan(log_lik[i])) {
      reject("log_lik[", i, "] is ", log_lik[i]);
    }
  }
prob = sum(log_lik);
return prob;
}
}
//data block
data{
  int <lower=0> m;
  int <lower=0> n_grid;
  int <lower=0> n;
  int <lower=0> num;
  int <lower=0> number;
  int <lower =0 , upper = m> county[n];
  vector <lower=0> [m] p;
  matrix [m,n_grid] kernel;
  matrix [n,number] data3;
}
//parameters block
parameters{
  vector [num] beta;
  real <lower=0> sigma;//scale parameter sigma=1/shape
  vector [n_grid] x;
  vector <lower=0> [m] r_county; 
}
// transformed parameters block
transformed parameters{
  vector [num] time_ratio;
vector[n] lambdaa;
real <lower=0> alpha = 1 / sigma;
  vector [n_grid] exp_x;
time_ratio  = exp(beta);
  exp_x = exp(x);
  //linpred = x*beta;//linear predictor
  lambdaa = exp (((-1 * data3[,4:7] * beta) / sigma)+log(r_county [county]));
}
// model block
model{
  target += normal_lpdf(x|0,5);
  target += normal_lpdf(beta | 0, 1000);
  target += inv_gamma_lpdf(sigma | 0.001, 0.001);
  target += normal_lpdf(r_county| 0, 1);
  for (i in 1:m) {
    if (is_nan(r_county[i]) || (r_county[i] < 0.0)) {
      reject("r_county[",i,"] is undefined");
    }
  }
  target += surv_loglogistic_lpdf (data3[,11] | data3[,2],p, kernel,  exp_x, county, alpha, lambdaa);
}
.```

Thank you!

What happens if you set the initial value of the sampling run to the parameter values of the data-generating run?

Thanks a ton for your reply and support @mike-lawrence. Can you explain more, please?
the parameter values of the data-generating run are (beta0=1, beta1=1, beta2=1,beta3=-0.50, scale=0.80,shape=0.90).

here are my codes for running in cmdstanr:

stanfunc<-function(mod,data3,kernel, chains,parallel_chains,init,thin,iter_warmup, iter_sampling ,max_treedepth ,adapt_delta,num,number,m,n_grid){
  file <- file.path( "sim.depen model with spat cov.right.final.20nd Jaunuary.stan")
  mod <- cmdstan_model(file)
  data3=simfun(n=310,f=10,g=150,beta0=1, beta1=1, beta2=1,beta3=-0.50, scale=0.80,shape=0.90)
  county <- data3$id
  n = length(data3$id)
  status <- data3$status
  spatcov=data3$spatcov
  cr <- 1-(sum(status)/n)
  p=c()
  for (j in 1:m){
    p[j] <- sum(data3$status[data3$id==j]) / length(data3$status[data3$id==j])
  }
  spatcovmean=c()
  for (k in 1:m){
    spatcovmean[k] <- sum(data3$spatcov[data3$id==k]) / length(data3$spatcov[data3$id==k])
  }
  da<- list(data3=data3,number=number,kernel=kernel,p=p,n_grid=n_grid,num=num,m=m, county= county, n=n)
  fit <- mod$sample(data = da,chains = chains,parallel_chains = parallel_chains,init=init,thin=thin,iter_warmup = iter_warmup,iter_sampling = iter_sampling,max_treedepth = max_treedepth,adapt_delta = adapt_delta)
  estimate <-  as.data.frame(fit$draws(c("beta","sigma","r_county")))
  spdf1 <- as.data.frame(fit$draws("spdff"))
  estimate.mean <- apply(estimate,2,mean)
  spdf1.mean <- apply(spdf1,2,mean)
return(list("fit"=fit,"p"=p,"cr"=cr,"estimate.mean"=estimate.mean,"spdf1.mean"=spdf1.mean,"spatcovmean"=spatcovmean))
}

codes for running the model with iteratations=50 in cmdstanr:

iterat <- 50
estimate.mean1<- matrix(NA, ncol=36, nrow=iterat)
ID<-rep(NA,iterat)
p1 <- matrix(NA, ncol=31, nrow=iterat)
cr<- rep(NA,iterat)
spatcovmean1 <- matrix(NA, ncol=31, nrow=iterat)
spdf11 <- matrix(NA, ncol=31, nrow=iterat)
boots <- function(iterat,result){
  for (i in 1:iterat){
    print(i)
    result = stanfunc(data3=data3,kernel=kernel , chains=1,parallel_chains = getOption("mc.cores", 1),init=0,thin=10,iter_warmup = 4000,iter_sampling = 4000,max_treedepth = 15,adapt_delta = 0.99,num=4,number=11,m=31,n_grid=354)
    ID[i]=i
    p1[i,] <- result$p
    cr[i] <- result$cr
    spdf11 [i,] <- result$spdf1.mean
    spatcovmean1[i,] <- result$spatcovmean
    estimate.mean1[i,] <- result$estimate.mean
  }
return(list("ID"=ID,"p1"=p1,"cr"=cr,"estimate.mean1"=estimate.mean1,"spdf11"=spdf11,"spatcovmean1"=spatcovmean1))
}
.```



Thank you in advance!

Can you put a line with three backticks (`) at the beginning and end of your code? That will cause it to format here better and make it easier for me to read

Sure @mike-lawrence, I edited my post now.

So here you have you data-simulating function, and presumably all the arguments reflect parameters in your model, so what I’m suggesting is that you try to confirm that your model samples ok at least in the ideal circumstances of supplying these known-true/known-to-generate-valid-data parameter values as initial values to the sampler. If that still throws an initialization error, then something is different between your data-generating simfun() function and the data generating process implied by your model.

1 Like

Thank you very much @mike-lawrence for your support and suggestions, I’ll check it and inform you results.
As above-mentioned, I got the error during the execution (after iterations=30) just when I execute my model with a high number of iterations (iterations=50). But, my model runs well and without any problems with a low number of iterations, for example, 5 iterations.

King regards,
Eisa

Hm, this suggests that your model’s parameterization is fragile such that it’s easy for the sampler to drift into areas of the parameter space that yield NaNs for the log-probability. Try looking at the iterations that yield such values and try to work out where in the computations from the parameters to the lp the numbers go off the rails.

Thank you so much @mike-lawrence for your valuable suggestions. Actually, I’m new to Stan and do not know exactly what to do to solve this problem. Can you check my model, please?

Here are my Stan codes:

functions{
  // defines the log spdf
  vector a (vector p,matrix kernel, vector exp_x){
    vector[num_elements(p)] a_; // `a` saved to output
   //{ // `z` not part of output
    vector[num_elements(p)] landa;
    for (k in 1 : num_elements(p)) {
        landa[k] = kernel[k,] * exp_x * p[k];
         }
   a_ = landa / sum(landa); // assign `a`
     for (i in 1:num_elements(p)) {
  if (is_nan(a_[i]) || (a_[i] < 0.0)) {
    reject("a_[",i,"] is undefined");
           }
           }
  return a_;
            }
// defines the log survival
  vector log_S(vector t, vector p, matrix kernel, vector exp_x,int[] county, real shape, vector scale) {
vector[num_elements(t)] log_S_;
vector[num_elements(p)] spdf;
spdf = a (p,kernel, exp_x);
for (i in 1:num_elements(t)){
//log_S[i] = log(spdf[county[i]]) + log(1- ((scale [i]*(t[i])^shape)/ (1+(scale [i]*(t[i])^shape)))); // county is a vector that shows the location of patients with dim = num_elements(t) 
 log_S_[i] = log(spdf[county[i]]) - log1p(scale[i] * (t[i] ^ shape)); // county is a vector that shows the location of patients with dim = num_elements(t) 
}
return log_S_;
}
//defines the log pdf
vector log_f (vector t,vector p, matrix kernel, vector exp_x, int[] county, real shape,vector scale){
vector[num_elements(t)] log_f_;
vector[num_elements(p)] spdf;
spdf = a (p,kernel, exp_x);
for (i in 1:num_elements(t)){
//log_f[i] = log(spdf[county[i]]) + log((scale[i]*shape*t[i]^(shape-1))/((1+(scale[i]*t[i]^shape))^2));
log_f_[i] = log(spdf[county[i]]) + log(scale[i]*shape*t[i]^(shape-1)) - 2* log1p((scale[i]*(t[i])^shape));
}
return log_f_;
}
//defines the log-likelihood
real surv_loglogistic_lpdf(vector t,vector d,vector p, matrix kernel, vector exp_x, int[] county, real shape,vector scale){
vector[num_elements(t)] log_lik;
real prob;
vector[num_elements(t)] ls_s;
vector[num_elements(t)] ls_f;
 for (i in 1 : num_elements(t)) {
    if (t[i] <= 0.0) {
      reject("t[", i, "] is ", t[i]);
    }
  }
ls_s = log_S( t, p,  kernel,  exp_x, county,  shape, scale);
ls_f = log_f( t, p,  kernel,  exp_x, county,  shape, scale);
log_lik = d .* log_f(t, p,  kernel,  exp_x, county,  shape, scale)+ (1 - d) .* log_S(t, p,  kernel,  exp_x, county,  shape, scale);
for (i in 1 : num_elements(log_lik)) {
    if (is_inf(log_lik[i]) || is_nan(log_lik[i])) {
      reject("log_lik[", i, "] is ", log_lik[i]);
    }
  }
prob = sum(log_lik);
return prob;
}
}
//data block
data{
  int <lower=0> m;
  int <lower=0> n_grid;
  int <lower=0> n;
  int <lower=0> num;
  int <lower=0> number;
  int <lower =0 , upper = m> county[n];
  vector <lower=0> [m] p;
  matrix [m,n_grid] kernel;
  matrix [n,number] data3;
}
//parameters block
parameters{
  vector [num] beta;
  real <lower=0> sigma;//scale parameter sigma=1/shape
  vector [n_grid] x;
  vector <lower=0> [m] r_county; 
}
// transformed parameters block
transformed parameters{
  vector [num] time_ratio;
vector[n] lambdaa;
real <lower=0> alpha = 1 / sigma;
  vector [n_grid] exp_x;
time_ratio  = exp(beta);
  exp_x = exp(x);
  //linpred = x*beta;//linear predictor
  lambdaa = exp (((-1 * data3[,4:7] * beta) / sigma)+log(r_county [county]));
}
// model block
model{
  target += normal_lpdf(x|0,5);
  target += normal_lpdf(beta | 0, 1000);
  target += inv_gamma_lpdf(sigma | 0.001, 0.001);
  target += normal_lpdf(r_county| 0, 1);
  for (i in 1:m) {
    if (is_nan(r_county[i]) || (r_county[i] < 0.0)) {
      reject("r_county[",i,"] is undefined");
    }
  }
  target += surv_loglogistic_lpdf (data3[,11] | data3[,2],p, kernel,  exp_x, county, alpha, lambdaa);
}
.```

By iterations, here, you don’t mean iterations of the HMC sampler but rather repeated attempts to simulate data and fit the model in a loop in your R code. What’s happening here is that with some small but finite probability your model fails to initialize, and then Stan errors. This doesn’t necessarily call for a reparameterization, but requires that you ensure that your inits don’t land in regions of parameter space that are so improbable that Stan has numerical issues computing the target density.

So a big +1 to

Thanks a ton for your great suggestions and reply @jsocolar. Yes, I mean iterations are repeated attempts to simulate data and fit the model in a loop in your R code. So, are the parameter values of the data-generating (beta0=1, beta1=1, beta2=1,beta3=-0.50, scale=0.80,shape=0.90) good for initial values based on the @mike-lawrence suggestions? Or do you have other recommendations to solve these issues?

What a “good” initialization is in the context of this simulation depends on what you are trying to use this simulation for. For example, if you are testing the ability of the model to recover known parameters in order to later use the model on “real” data, then you should use whatever initialization strategy you would ultimately use on the real data. What would you do, on real data, if the default inits failed? Whatever that is, do that in your testing. Note that this may be difficult to automate.

Big thanks for supporting and helping @jsocolar. My purpose in the simulation is to test the ability of the model to recover known parameters. I try to run the model for simulated data with default inits and init=0 in cmdstanr.

I want to inform you that I tried to run my model for simulated data with 50 iterations using the parameter values of the data-generating as initial values(beta0=1, beta1=1, beta2=1,beta3=-0.50, scale=0.80,shape=0.90), however, again I got the same previous error as below:

"
Model executable is up to date!
Running MCMC with 1 chain…
Chain 1 Rejecting initial value:
Chain 1 Log probability evaluates to log(0), i.e. negative infinity.
Chain 1 Stan can’t start sampling from this initial value.
Chain 1 Initialization failed.
Warning: Chain 1 finished unexpectedly!
Error: No chains finished successfully. Unable to retrieve the draws.
In addition: Warning message:
Error: No chains finished successfully. Unable to retrieve the draws.
".

If you have any other recommendations or suggestions to solve this problem@mike-lawrence and @jsocolar, I would be appreciated it.

Can @rok_cesnovar or any else help me to tackle this problem in cmdstanr?

Thank you in advance!