Dear all,
I’m trying to fit a spatial survival model with RStan on a large dataset (N = 35,000), but sampling is very slow: it takes about 10 days even for 2,000 iterations (500 warmup).
Could anyone help me solve this problem, please?
Any tips and comments are welcome. Thanks in advance!
My Stan code is below (N = 35000, n_grid = 350, province = 31, num = 12).
In my code, the x[j] (j = 1, 2, …, n_grid) are white noise; they are used only in the calculation of a, so there is no need to estimate them.
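Since x does not need to be estimated, I think it could be generated once in R and passed in as data instead (a sketch of the change, assuming the generated vector is added to da):

data {
  // ... existing declarations ...
  vector[n_grid] x;  // pre-generated white noise passed in via `da`
}

x would then leave the parameters block and its normal_lpdf prior would be dropped from the model block, removing n_grid = 350 dimensions from the posterior.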
fit <- stan("final model.stan", data = da, iter = iter, warmup = warmup,
            cores = 4, chains = 2, thin = 1, init = "random",
            control = list(max_treedepth = 15))
data {
  int<lower=0> province;               // number of provinces (31)
  int<lower=0> n_grid;                 // number of grid cells (350)
  int<lower=0> N;                      // number of observations (35000)
  int<lower=0> num;                    // number of covariates (12)
  int<lower=0> number;                 // number of columns of respon
  vector<lower=0>[province] p;
  row_vector[n_grid] kernel[province];
  matrix[N, number] respon;
}
parameters {
  vector[num] beta;
  real<lower=0> sigma0;
  real<lower=0> sigma;
  vector[n_grid] x;
}
transformed parameters {
  real alpha = 1 / sigma;
  vector[N] lambdaa;
  vector[n_grid] exp_x;
  vector[province] a;  // `a` saved to output
  exp_x = exp(x);
  lambdaa = exp(-(respon[, 6:17] * beta) / sigma);
  { // local block: `landa` not part of output
    vector[province] landa;
    for (k in 1:province) {
      landa[k] = kernel[k] * exp_x * p[k];
    }
    a = landa / sum(landa);  // assign `a`
    // (a vectorized sketch of this loop is given after the model block)
  }
}
model {
  // priors
  target += normal_lpdf(x | 0, 5);
  target += normal_lpdf(beta | 0, sigma0);
  target += inv_gamma_lpdf(sigma0 | 0.001, 0.001);
  // note: log(alpha) is a nonlinear transform of the parameter sigma, so this
  // prior may need a Jacobian adjustment (target += -log(sigma))
  target += normal_lpdf(log(alpha) | 0, 5);
  // likelihood: respon[i, 4] is the survival time and respon[i, 5] the event
  // indicator; f_i and s_i are log-logistic density and survival terms
  // weighted by a[k] (a vectorized sketch of this loop is given below)
  for (i in 1:N) {
    vector[province] f_i;
    vector[province] s_i;
    for (k in 1:province) {
      f_i[k] = a[k] * (lambdaa[i] * alpha * respon[i, 4]^(alpha - 1))
                    / (1 + lambdaa[i] * respon[i, 4]^alpha)^2;
      s_i[k] = a[k] * (1 - (lambdaa[i] * respon[i, 4]^alpha)
                           / (1 + lambdaa[i] * respon[i, 4]^alpha));
    }
    target += respon[i, 5] * log(f_i) + (1 - respon[i, 5]) * log(s_i);
  }
}
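As noted in transformed parameters, the landa loop can be collapsed into a single matrix-vector product if kernel is passed as a matrix rather than an array of row_vectors (a sketch, assuming the data are reshaped accordingly):

data {
  matrix[province, n_grid] kernel;  // rows are the previous row_vectors
}
transformed parameters {
  vector[province] landa = (kernel * exp_x) .* p;  // replaces the k loop
  vector[province] a = landa / sum(landa);
}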
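For the likelihood, f_i[k] and s_i[k] depend on k only through a[k], so each observation contributes sum(log(a)) plus province times a per-observation term, and the double loop can be vectorized. Here is a sketch that should reproduce the same target as the loop above (assuming respon[, 4] > 0; if the intended likelihood is the log of the mixture sum over provinces instead, the algebra would differ):

model {
  // ... priors as above ...
  {
    vector[N] t = respon[, 4];                      // survival times
    vector[N] d = respon[, 5];                      // event indicator
    vector[N] lt = lambdaa .* exp(alpha * log(t));  // lambdaa[i] * t[i]^alpha
    vector[N] log_f = log(lambdaa) + log(alpha)
                      + (alpha - 1) * log(t) - 2 * log1p(lt);
    vector[N] log_s = -log1p(lt);                   // log(1 - F)
    target += N * sum(log(a))
              + province * (dot_product(d, log_f) + dot_product(1 - d, log_s));
  }
}

This removes the N * province inner loop (roughly a million scalar operations per leapfrog step), which I suspect is where most of the time goes.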
EDIT: @maxbiostat edited this post for syntax highlighting.