Reduce_sum function in stan

Isa_Nazar · December 4, 2021, 11:50am

Dear all,

I’m new to Stan. The following paragraph is my Stan code for the hierarchical survival model. But I had trouble running my model (about ~ 8 days for iter=10000, warm up=5000, N=35000). Now, I want to run my model by with-in chain parallelization method with reduce_sum (25.1 Reduce-sum | Stan User’s Guide).

How can I run my model using this method? Can anyone help me to change my codes in RSTAN?

Thank you in advance!

My Stan code is as below (N=35000, province=31, num=12):


'''
functions{
   real icar_normal_lpdf(vector phi, int province, int[] node1, int[] node2) {
    return -0.5 * dot_self(phi[node1] - phi[node2]);
  }
  // defines the log survival
  vector log_S (vector t, real shape,vector scale){
    vector[num_elements(t)] log_S;
    for (i in 1:num_elements(t)){
      log_S[i] =  log(1- ((scale [i]*(t[i])^shape)/ (1+(scale [i]*(t[i])^shape)))); // county is a vector that shows the location of patients with dim = num_elements(t) 
    }
    return log_S;
  }
  //defines the log pdf
  vector log_f (vector t,real shape,vector scale){
    vector[num_elements(t)] log_f;
    for (i in 1:num_elements(t)){
      log_f[i] =log((scale[i]*shape*t[i]^(shape-1))/((1+(scale[i]*t[i]^shape))^2));
    }
    return log_f;
  }
  //defines the log-likelihood for right censored data
  real surv_loglogistic_lpdf(vector t,vector d, real shape,vector scale){
    vector[num_elements(t)] log_lik;
    real prob;
    log_lik = d .* log_f(t, shape, scale)+ (1 - d) .* log_S(t, shape, scale);
    prob = sum(log_lik);
    return prob;
  }
   real partial_sum(array[] int y_slice,
                   int start, int end,vector d,
                   real shape, vector scale) {
    return surv_loglogistic_lpdf(y_slice | d, shape, scale);
  }
}
//data block
data{
  int <lower=0> province;
  int <lower=0> N;
  int <lower=0> num;
  int <lower=0> number;
  int <lower =0 , upper = province> county[N];
  matrix [N,number] respon;
   int<lower=0> N_edges;
  int<lower=1, upper=province> node1[N_edges];  // node1[i] adjacent to node2[i]
  int<lower=1, upper=province> node2[N_edges];  // and node1[i] < node2[i]
}
//parameters block
parameters{
  vector [num] beta;
  real <lower=0> sigma;//scale parameter sigma=1/shape
  vector <lower=0> [province] sigma1;
  vector[province] w1;  // random effect (CAR)
  vector[province] theta;       // heterogeneous effects  
}
// transformed parameters block
transformed parameters{
  vector [num] time_ratio;
  //vector[N] linpred;
  vector[N] lambdaa;
  real <lower=0> alpha = 1 / sigma;
time_ratio  = exp(beta);
  lambdaa = exp ((-respon[,7:18] * beta - w1[county] - theta[county]) / sigma);
}
// model block
model{
  target += normal_lpdf(beta | 0, 1000);
  //target += student_t_lpdf(sigma| 3,0,1); 
  target += gamma_lpdf(alpha | 0.001, 0.001);
   
   w1 ~ icar_normal_lpdf(province, node1, node2);
 
  sum(w1) ~ normal(0, 0.001 * province);
   

   
    target += normal_lpdf(theta| 0, sigma1);
        target += student_t_lpdf(sigma1| 3,0,1); 


  respon[,4] ~ surv_loglogistic//model for the data
}
// generated quantities block
generated quantities{
  vector[N] log_lik;//log-likelihood
  
  { for(n in 1:N){
    log_lik[n] = respon[n,6] * (log((exp ((-respon[n,7:18] * beta - w1[county[n]] - theta[county[n]])*alpha)*alpha*respon[n,4]^(alpha-1)) / ((1+(exp ((-respon[n,7:18] * beta - w1[county[n]] - theta[county[n]])*alpha)*(respon[n,4])^(alpha)))^2))) + (1-respon[n,6]) * ( log(1- ((exp ((-respon[n,7:18] * beta - w1[county[n]] - theta[county[n]])*alpha)*(respon[n,4])^alpha) / (1+(exp ((-respon[n,7:18] * beta - w1[county[n]] - theta[county[n]])*alpha)*(respon[n,4])^alpha)))));}
  }
}
'''

Ara_Winter · December 7, 2021, 7:43pm

Hello,

I cleaned up your post so the model code was easier to read. A few things that will help folks answer your question:
What version of Stan are you running?
Is this in R or python?
Can you provide some simulated data (small data set) or a snippet of real data to play with?
Did you run simulated data with known parameters and did the model recover those parameters?

Thanks!

Isa_Nazar · December 21, 2021, 8:34am

Sorry for the long delay, and thanks for your reply. Actually, I’m new to rstan, and I installed the latest version of R. Now, my model recovers the parameters well. But, I had a problem in within-chain parallelization of my codes in RSTAN because the run of it for my real data longs about 7 days(N=35000).
Can you help me to solve this issue?

Thank you in advance!

wds15 · December 21, 2021, 1:11pm

Rstan does not support reduce sum s you need at least Stan version 2.24 (I think)…so use cmdstanr and install cmdstan 2.28.2.

wds15 · September 27, 2023, 12:44pm

RStan 2.26 is on CRAN now and it includes reduce_sum!

wpetry · September 27, 2023, 12:49pm

Yes it does. v2.26.23 has a startup message that gives you tips on how to set options when using reduce_sum"

> library(rstan)
Loading required package: StanHeaders

rstan version 2.26.23 (Stan version 2.26.1)

For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).
To avoid recompilation of unchanged Stan programs, we recommend calling
rstan_options(auto_write = TRUE)
For within-chain threading using `reduce_sum()` or `map_rect()` Stan functions,
change `threads_per_chain` option:
rstan_options(threads_per_chain = 1)

Topic		Replies	Views
RStan parallelising using reduce_sum() Modeling	1	439	July 30, 2021
Within-chain parallelization in Stan Developers rstan , fitting-issues	2	1138	June 19, 2021
Parallelization in Stan General	6	501	October 24, 2020
How to use reduce_sum in for loops? Modeling	8	363	January 10, 2024
If I can spped up base my reduce sum stan code Modeling rstan	1	240	July 17, 2023

Reduce_sum function in stan

Related topics