Issues with running survival model in Stan

Isa_Nazar · May 24, 2021, 11:01am

Hi all,
I am trying to run a survival model in Stan. I get the following error message but I cannot understand why this is the case.
Can someone help me with it?
Thank you!
Stan codes:

data {
  int <lower=0> province;
  int <lower=0> n_grid;
  int <lower=0> N;
  vector <lower=0> [province] p;
   matrix [province,n_grid] kernel;
  vector [N] age;
   vector [N] time;
  
  //vector [n_grid] x;
 
  vector[N] censor_status;
}
parameters{
  real beta0;
  real beta;
    real <lower=0> sigma;
    real <lower=0> sigma1;
 vector [n_grid] x;
}
transformed parameters{
  real time_ratio;
  vector [N] lambdaa;
  time_ratio  = exp(beta);
  for(n in 1:N) {
  lambdaa [n] = exp (beta0 + beta*age[n]);
  }
}
model{
  matrix [province,n_grid] landa;
     vector [province] z;
  vector [province] a;
  
  for (j in 1:n_grid){
   target += normal_lpdf(x[j]|0,1);
  }
  target += normal_lpdf(beta0| 0, sigma);
  target += normal_lpdf(beta| 0, sigma1);
  target += cauchy_lpdf(sigma| 0, 5);
  target += cauchy_lpdf(sigma1| 0, 5);
  for (k in 1:province) {
      for (j in 1:n_grid) {
          landa [k,j] = kernel [k,j] * exp (x[j]);
            }
            z[k] = sum(landa[k,]) * p[k];
            a[k] = z[k]/sum(z);
        }
  for (k in 1:province){
    for (i in 1:N) {
      target += (censor_status [i] * a[k] * lambdaa [i] * exp( -1 * lambdaa [i] * time[i])) + ((1 - censor_status [i])*a[k]*exp(-1 * lambdaa [i] * time[i]));
    }
  }
}

Error:

Stan can't start sampling from this initial value.
Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.
Chain 1: 
Chain 1: Initialization between (-2, 2) failed after 100 attempts. 
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
error occurred during calling the sampler; sampling not done

EDIT: @maxbiostat edited this post for syntax highlighting.

nhuurre · May 24, 2021, 11:51am

The initialization error is caused by incorrect assignment to a.
You cannot compute sum(z) inside the loop that initializes z; you have to wait until after every element of z has been assigned.

I see that sigma and sigma1 don’t do anything other than set the prior for beta and beta0.
Making them parameters with very wide distribution is pretty much the same as just giving betas very wide priors.

I think estimating x looks quite difficult and I recommend you start with a simpler model if possible.

data {
  int<lower=0> province;
  int<lower=0> n_grid;
  int<lower=0> N;
  vector<lower=0>[province] p;
  matrix[province, n_grid] kernel;
  vector[N] age;
  vector[N] time;
  vector[n_grid] x;
  int censor_status[N];
}
parameters {
  real beta0;
  real beta;
}
transformed parameters {
  real time_ratio;
  vector[N] lambdaa;
  time_ratio = exp(beta);
  for (n in 1 : N) {
    lambdaa[n] = exp(beta0 + beta * age[n]);
  }
}
model {
  matrix[province, n_grid] landa;
  vector[province] z;
  vector[province] a;
  for (j in 1 : n_grid) {
    target += normal_lpdf(x[j]| 0, 1);
  }
  target += cauchy_lpdf(beta0| 0, 5);
  target += cauchy_lpdf(beta| 0, 5);
  for (k in 1 : province) {
    for (j in 1 : n_grid) {
      landa[k, j] = kernel[k, j] * exp(x[j]);
    }
    z[k] = sum(landa[k,  : ]) * p[k];
  }
  a = z / sum(z);
  for (k in 1 : province) {
    for (i in 1 : N) {
      if (censor_status[i] == 1) {
        // target is the _logarithm_ of probability
        // so I think this what you were going for
        target += log(a[k]) + exponential_lpdf(time[i]| lambdaa[i]);
      } else {
        target += log(a[k]) + exponential_lccdf(time[i]| lambdaa[i]);
      }
    }
  }
}

Isa_Nazar · May 24, 2021, 12:11pm

Thank you so much for your great help. Actually, I’m new to Stan.
Can you help me to run my model, please?
How can I compute sum(z) outside the loop that initializes z?

Here, x[j]'s (j=1,2, …, n_grid) are white noise and It is used only in the calculation of a and there is no need to estimate it.

Best regards,
Eisa

nhuurre · May 24, 2021, 12:18pm

I posted a complete model above but here’s the relevant part

for (k in 1:province) {
    for (j in 1:n_grid) {
        landa [k,j] = kernel [k,j] * exp (x[j]);
    }
    z[k] = sum(landa[k,]) * p[k];
}
a = z / sum(z);

Isa_Nazar · May 24, 2021, 1:10pm

I really appreciate your help. Now the model is running and there is no problem. I need the calculated values of a in the model, how can I get it in the output?

Thank you!

nhuurre · May 24, 2021, 4:19pm

You can move z and a to the transformed parameters block. If you don’t want to save z you can place it in a nested block.

transformed parameters {
  real time_ratio;
  vector[N] lambdaa;
  vector[province] a; // `a` saved to output because it's a toplevel transformed parameter
  time_ratio = exp(beta);
  for (n in 1 : N) {
    lambdaa[n] = exp(beta0 + beta * age[n]);
  }
  { // `z` is only visible inside these braces, not part of output
    vector[province] z;
    for (k in 1 : province) {
      vector[n_grid] landa_k;
      for (j in 1 : n_grid) {
        landa_k[j] = kernel[k, j] * exp(x[j]);
      }
      z[k] = sum(landa_k) * p[k];
    }
    a = z / sum(z); // assign `a`
  }
}

Isa_Nazar · May 24, 2021, 4:46pm

I’m so grateful for your help and support. It was a challenging model for me but you made it easier. Thank you.

Sincerely yours,
Eisa

Topic		Replies	Views
Issues with running spatial survival model in Stan Modeling rstan	5	431	May 24, 2021
Issues with running survival model in Stan Modeling rstan , techniques , fitting-issues	4	466	May 24, 2021
Rstan parser error General	6	486	November 16, 2022
How to speed up my Stan code and sampling in rstan? Modeling rstan , fitting-issues , performance	8	1169	May 28, 2021
Unexpected error in running vb() Modeling	4	589	August 14, 2023

Issues with running survival model in Stan

Related topics