Hi,

This is my first attempt at using reduce_sum to improve my model’s run time. I have pasted the model code below in case anyone is kind enough to take a look, and I’d be happy to provide more context on the model itself if needed.

The reduce_sum version appears to be working correctly (its results are identical to those of the non-reduce_sum version, which I implemented first), but I am not seeing any improvement in run time. Having read several posts on this forum, I wonder whether the way I have structured the model creates too much overhead, but unfortunately I am not experienced enough to figure it out myself. Could someone with more experience spot the culprit just by eyeballing the code? (If there is a culprit at all; it is also possible that this model simply does not lend itself to reduce_sum.)

Note that I am currently slicing over 30 years of data, running two chains with 4 cores per chain and grainsize = 1; grainsizes of 2–4 did not make a difference. With this few cores I am not expecting huge performance improvements, but I was expecting at least some. I could get access to more cores, but first I would like to make sure that I am not doing something fundamentally clumsy in how I wrote the code, given my lack of experience. Any insight would be greatly appreciated, and again, I’d be happy to provide more details.

Thanks!

My current setup:

- Windows
- RStan 2.32.3
- Stan 2.26.1
- cmdstanr 0.7.0
- CmdStan 2.33.1

```
functions {
//1. Function that calculates model predictions (lyhat) for one year
array[] vector year_model(
matrix pet_ac_mat, //For each year, this has size 79000*20
int nnode, //Equal to 86798
vector pcp_dir, //For each year, this has size 79000
vector param, //Vector of 20 shared parameters
int ncat, //Equal to 79000
int numobs, //Number of observations for the dependent variable in this year
array[] int from_cd, //This vector has size 79000
array[] int to_cd, //This vector has size 79000
vector depvar, //Dependent variable. For each year, this vector has size 79000; the number of nonzero observations varies by year (numobs, at most 448)
array[] real Dfac, //This vector has size 79000
array[] real Id, //This vector has size 79000
array[] int iftr, //This vector has size 79000
array[] real transf) { //This vector has size 79000
array[nnode] real pnode = rep_array(0.0, nnode); //running totals per node, initialized to zero
vector[numobs] lyhat;
vector[ncat] incremental;
int fnode;
int tnode;
int ie = 0; //counter over observed entries
real cat_total;
real cat_total_scaled;
vector[numobs] ldepvar;
incremental = pcp_dir - pet_ac_mat * param;
for (i in 1:ncat) { //ncat = 79000; note: Stan comments use //, not #
fnode = from_cd[i];
tnode = to_cd[i];
cat_total = incremental[i] + transf[i] * pnode[fnode];
if(cat_total<0) {
cat_total = 0.0;
}
if(depvar[i] != 0.0) {
if(Id[i] == 1) {
cat_total_scaled = incremental[i] * Dfac[i];
} else if(Id[i] == 2) {
cat_total_scaled = rchpld * Dfac[i];
} else {
cat_total_scaled = (cat_total - incremental[i]) * Dfac[i];
}
if(i == 79826){
cat_total_scaled = cat_total_scaled + pnode[2006];
}
ie = ie + 1;
if (cat_total_scaled > 0) {
lyhat[ie] = log(cat_total_scaled);
} else {
lyhat[ie] = log(1.0);
}
ldepvar[ie] = log(depvar[i]);
} // end depvar if
pnode[tnode] = pnode[tnode] + iftr[i] * cat_total;
} // end reach loop
return({ lyhat, ldepvar });
} //End of year_model function
//2. Partial_sum function that calculates the likelihood contribution for years m = start ... end
// by using function above
real year_lpdf(array[] matrix pet_ac_mat_slice, //slice of pet_ac_mat for years start..end
int start, int end,
int nnode,
matrix pcp_dir,
vector param,
int ncat,
array[] int numobs,
array[] int from_cd,
array[] int to_cd,
matrix depvar,
array[] real Dfac,
array[] real Id,
array[] int iftr,
array[] real transf,
real bsigma
) {
real log_lik = 0.0;
for (m in start:end) {
int m_slice = m - start + 1; //index within the slice (re-starts at 1 for each new slice)
array[2] vector[numobs[m]] local = year_model(
pet_ac_mat_slice[m_slice],
nnode,
col(pcp_dir, m),
param,
ncat,
numobs[m],
from_cd,
to_cd,
col(depvar, m),
Dfac,
Id,
iftr,
transf);
vector[numobs[m]] lyhat;
vector[numobs[m]] ldepvar;
lyhat = local[1];
ldepvar = local[2];
log_lik += normal_lpdf(ldepvar | lyhat, bsigma);
}
return(log_lik);
}
} //End of functions block
data {
int<lower=0> ncat; //79000
int<lower=0> nnode; //86798
int nyears; //30
int<lower=0> numobs_max; //448
array[nyears] int numobs; //number of observations in each year
array[ncat] int iftr; //size of array = 79000
matrix[ncat, nyears] pcp_dir; //size of matrix = 79000*30
array[nyears] matrix[ncat, 20] pet_ac_mat; //array of 30 matrices, each with size = 79000*20
array[ncat] int from_cd; //size of array = 79000
array[ncat] int to_cd; //size of array = 79000
array[ncat] real transf; //size of array = 79000
matrix[ncat, nyears] depvar; //size of matrix = 79000*30
array[ncat] real Dfac; //size of array = 79000
array[ncat] real Id; //size of array = 79000
}
parameters {
vector<lower=0, upper=10> [20] param;
real<lower=0, upper=5> bsigma;
}
model {
param ~ normal(0,5);
bsigma ~ normal(0,5);
target += reduce_sum(
year_lpdf,
pet_ac_mat,
1, //grainsize
nnode,
pcp_dir,
param,
ncat,
numobs,
from_cd,
to_cd,
depvar,
Dfac,
Id,
iftr,
transf,
bsigma);
} // End of model statement
```
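For what it’s worth, here is a sketch of an alternative structure I considered, based on the pattern in the Stan User’s Guide where you slice over a small index array so that all the large objects are passed as shared arguments rather than being the sliced argument. The names `year_idx` and `year_index_lpdf` are placeholders of mine, `year_model()` is unchanged from above, and the data block is the same; I’d be curious whether this structure would behave differently:

```
functions {
// ... year_model() exactly as above ...
//Alternative partial sum: slice a small integer index array instead of
//the array of large matrices; everything big stays a shared argument
real year_index_lpdf(array[] int year_slice, int start, int end,
array[] matrix pet_ac_mat, int nnode, matrix pcp_dir, vector param,
int ncat, array[] int numobs, array[] int from_cd, array[] int to_cd,
matrix depvar, array[] real Dfac, array[] real Id, array[] int iftr,
array[] real transf, real bsigma) {
real log_lik = 0.0;
for (k in 1:size(year_slice)) {
int m = year_slice[k]; //actual year index
array[2] vector[numobs[m]] local = year_model(
pet_ac_mat[m], nnode, col(pcp_dir, m), param, ncat, numobs[m],
from_cd, to_cd, col(depvar, m), Dfac, Id, iftr, transf);
log_lik += normal_lpdf(local[2] | local[1], bsigma);
}
return log_lik;
}
}
transformed data {
array[nyears] int year_idx;
for (m in 1:nyears) year_idx[m] = m;
}
model {
param ~ normal(0, 5);
bsigma ~ normal(0, 5);
target += reduce_sum(year_index_lpdf, year_idx, 1, //grainsize
pet_ac_mat, nnode, pcp_dir, param, ncat, numobs,
from_cd, to_cd, depvar, Dfac, Id, iftr, transf, bsigma);
}
```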