How to make a model more efficient

Bene_Bachelot · November 15, 2017, 3:17am

Hello,

I have a fairly simple model that is running fine… but it’s too slow. I know there is way to vectorize everything to avoid loops and improve the speed, but every single thing I’ve done following various postings leads the code to crash (I don’t even get an error… it just doesn’t run).

Here is the code:

data {
int<lower=0> nsite;
int<lower=0> ngroup;
int<lower=0> n;
int<lower=0> nbin;
int<lower=0> nharm;
int<lower=0> site[n];
int<lower=0> group[n];
int<lower=0> bin[n];
matrix[nbin,nharm] Z;
int<lower=0> pattern[n];
}

parameters {
real a[nsite,ngroup,nharm];
}

transformed parameters {
vector[n] mu;
for(i in 1:n)
{
mu[i]=exp(a[site[i],group[i],1]*Z[bin[i],1]+a[site[i],group[i],2]*Z[bin[i],2]+a[site[i],group[i],3]*Z[bin[i],3]+a[site[i],group[i],4]*Z[bin[i],4]+a[site[i],group[i],5]*Z[bin[i],5]+a[site[i],group[i],6]*Z[bin[i],6]+a[site[i],group[i],7]*Z[bin[i],7]+a[site[i],group[i],8]*Z[bin[i],8]+a[site[i],group[i],9]*Z[bin[i],9]);
}
}

model {

for(k in 1:nharm)
{
for(i in 1:ngroup)
{
for(j in 1:nsite)
{
a[j,i,k]~normal(0,100);
}
}
}
pattern~poisson(mu);
}

generated quantities {
int<lower=0> newpattern[n];
for(i in 1:n)
{
newpattern[i]=poisson_rng(mu[i]);
}
}

I tried to vectorize newpattern so I don’t use a loop like I did in the model section for pattern… but that doesn’t run… I tried to vectorize mu in generated quantities… didn’t work… I tried many different ways to remove the loop to give a prior for each element of the array a… nothing worked. I have no idea left regarding what I can try to do. Any help would be greatly appreciated!

Thanks!

stijn · November 15, 2017, 6:28am

You can replace your three level loop and this will probably speed somethings up.

to_vector(to_array_ld(a)) ~ normal(0, 100)

Further improvement might come when you use matrix algebra with a and Z.

Bene_Bachelot · November 15, 2017, 3:12pm

Thank you! I tried your suggestion for a and it worked. It makes the model run a little faster.

How would I use matrix algebra between a and Z? I tried many different options and no luck:
I tried outside the loop like this: mu=exp(a[site,group,:](Z[bin,:])’) (with or without the : ), I tried the same thing in the loop mu=exp(a[site[i],group[i],:](Z[bin[i],:])’).

stijn · November 16, 2017, 2:20am

Here is how I would do it.

transformed data{
  int<lower=0, upper=nsite * ngroup> index_a[n] = (site - 1) * ngroup + group;
}
parameters{
  matrix[nharm, nsite * ngroup] a;
}
model{
  matrix[nbin, nsite * ngroup] Za = Z * a;
  to_vector(a) ~ normal(0, 100);
  pattern ~ poisson(exp(Za[bin, index_a]));
}
generated quantities{
  int <lower=0> newpattern[n];
  {
    matrix[nbin, nsite * ngroup] Za = Z * a
    for(i in 1:n){
      newpattern[i] = poisson_rng(exp(Za[bin[i], index_a[i]))
    }
  }
}

Avoiding the transformed parameters could save time by not having to save the intermediate mu variable for every iteration.

Doing the intermediate calculation for Za instead of mu is cutting down on the number of calculations. I think you won’t be penalised by the indexation in Za[bin, index_a].

I also think that narrower priors might help. The priors seem very flat now. With a = 100, an increase of 1 in Z implies a multiplicative effect of exp(100) on new pattern.

Bene_Bachelot · November 17, 2017, 3:27am

Great, thank you so much for the tips!

It is not working yet (and sadly not giving me a reason why “Error in stanc(file = file, model_code = model_code, model_name = model_name, : c++ exception (unknown reason)”). I’m changing a few things around to figure out where the issue is coming from. Hopefully I can find the source of the error.

stijn · November 17, 2017, 3:48am

Building Rstan from source might help to get more informative error messages.

I already discovered some mistakes in my suggestion. I will report back when I solved them all. Sorry about that.

stijn · November 17, 2017, 3:58am

The model below passes the stanc tests. The biggest problem was that I mixed arrarys and vectors. That is why I need the loop in transformed data and the intermediate variable Za_array in the model block. I also forgot some closing ;'s in the generated quantity block.

I hope this works and speeds things up. You might have to check whether the Za_array keeps the right order for the observations. I think it should but I might be wrong, I am still not 100% confident in the indexes.

data {
int<lower=0> nsite;
int<lower=0> ngroup;
int<lower=0> n;
int<lower=0> nbin;
int<lower=0> nharm;
int<lower=0> site[n];
int<lower=0> group[n];
int<lower=0> bin[n];
matrix[nbin,nharm] Z;
int<lower=0> pattern[n];
}
transformed data{
  int<lower=0, upper=nsite * ngroup> index_a[n];
  for (i in 1:n){
    index_a[i] = (site[i] - 1) * ngroup + group[i];
  }

}
parameters{
  matrix[nharm, nsite * ngroup] a;
}
model{
  matrix[nbin, nsite * ngroup] Za = Z * a;
  to_vector(a) ~ normal(0, 100);
  { 
    real Za_array[n] = to_array_1d(Za[bin, index_a]);
    pattern ~ poisson(exp(Za_array));
  }
}
generated quantities{
  int <lower=0> newpattern[n];
  {
    matrix[nbin, nsite * ngroup] Za = Z * a;
    for(i in 1:n){
      newpattern[i] = poisson_rng(exp(Za[bin[i], index_a[i]]));
    }
  }
}

Bene_Bachelot · November 17, 2017, 4:41pm

Awesome. I ended up having the change the Za_array because to_array_1d(Za[bin, index_a]) would lead to a vector of size length(bin)*length(index_a). So I added a little loop… which might slow things up a bit. But it is still much faster than before!

data {
int<lower=0> nsite;
int<lower=0> ngroup;
int<lower=0> n;
int<lower=0> nbin;
int<lower=0> nharm;
int<lower=0> site[n];
int<lower=0> group[n];
int<lower=0> bin[n];
matrix[nbin,nharm] Z;
int<lower=0> pattern[n];
}

transformed data{
int<lower=0, upper=nsitengroup> index_a[n];
for (i in 1:n){
index_a[i] = (site[i] - 1)ngroup + group[i];
}
}
parameters{
matrix[nharm, nsitengroup] a;
}
model{
matrix[nbin, nsite * ngroup] Za = Za;
to_vector(a) ~ normal(0, 100);
{
real Za_array[n];
for(i in 1:n)
{
Za_array[i]= Za[bin[i], index_a[i]];
}
pattern ~ poisson(exp(Za_array));
}
}
generated quantities{
int <lower=0> newpattern[n];
{
matrix[nbin, nsite * ngroup] Za = Z * a;
for(i in 1:n){
newpattern[i] = poisson_rng(exp(Za[bin[i], index_a[i]]));
}
}
}

Bob_Carpenter · November 27, 2017, 10:04pm

Do you have the latest RStan? It should be giving you a reason. (I think there may be some residual issue on some Mac installations, which is perhaps why @stijn was suggesting recompiling R!)

Topic		Replies	Views
How can I make my script more efficient? Depending on the size of the data matrix it can take a while to run Modeling techniques , specification , performance	3	478	April 21, 2023
Vectorization and non-centered parameterization General	1	446	May 17, 2019
Help with vectorizing for loops Modeling	12	844	January 28, 2022
Any ideas for speeding up this chunk of my transformed parameters block? Modeling	13	560	March 25, 2022
Increasing Stan efficiency by vectorizing for loop Modeling	6	680	October 9, 2022

How to make a model more efficient

Related topics