How to make a model more efficient

Hello,

I have a fairly simple model that is running fine… but it’s too slow. I know there is way to vectorize everything to avoid loops and improve the speed, but every single thing I’ve done following various postings leads the code to crash (I don’t even get an error… it just doesn’t run).

Here is the code:

data {
int<lower=0> nsite;
int<lower=0> ngroup;
int<lower=0> n;
int<lower=0> nbin;
int<lower=0> nharm;
int<lower=0> site[n];
int<lower=0> group[n];
int<lower=0> bin[n];
matrix[nbin,nharm] Z;
int<lower=0> pattern[n];
}

parameters {
real a[nsite,ngroup,nharm];
}

transformed parameters {
vector[n] mu;
for(i in 1:n)
{
mu[i]=exp(a[site[i],group[i],1]*Z[bin[i],1]+a[site[i],group[i],2]*Z[bin[i],2]+a[site[i],group[i],3]*Z[bin[i],3]+a[site[i],group[i],4]*Z[bin[i],4]+a[site[i],group[i],5]*Z[bin[i],5]+a[site[i],group[i],6]*Z[bin[i],6]+a[site[i],group[i],7]*Z[bin[i],7]+a[site[i],group[i],8]*Z[bin[i],8]+a[site[i],group[i],9]*Z[bin[i],9]);
}
}

model {

for(k in 1:nharm)
{
for(i in 1:ngroup)
{
for(j in 1:nsite)
{
a[j,i,k]~normal(0,100);
}
}
}
pattern~poisson(mu);
}

generated quantities {
int<lower=0> newpattern[n];
for(i in 1:n)
{
newpattern[i]=poisson_rng(mu[i]);
}
}

I tried to vectorize newpattern so I don’t use a loop like I did in the model section for pattern… but that doesn’t run… I tried to vectorize mu in generated quantities… didn’t work… I tried many different ways to remove the loop to give a prior for each element of the array a… nothing worked. I have no idea left regarding what I can try to do. Any help would be greatly appreciated!

Thanks!

You can replace your three level loop and this will probably speed somethings up.

to_vector(to_array_ld(a)) ~ normal(0, 100)

Further improvement might come when you use matrix algebra with a and Z.

Thank you! I tried your suggestion for a and it worked. It makes the model run a little faster.

How would I use matrix algebra between a and Z? I tried many different options and no luck:
I tried outside the loop like this: mu=exp(a[site,group,:](Z[bin,:])’) (with or without the : ), I tried the same thing in the loop mu=exp(a[site[i],group[i],:](Z[bin[i],:])’).

Here is how I would do it.

transformed data{
  int<lower=0, upper=nsite * ngroup> index_a[n] = (site - 1) * ngroup + group;
}
parameters{
  matrix[nharm, nsite * ngroup] a;
}
model{
  matrix[nbin, nsite * ngroup] Za = Z * a;
  to_vector(a) ~ normal(0, 100);
  pattern ~ poisson(exp(Za[bin, index_a]));
}
generated quantities{
  int <lower=0> newpattern[n];
  {
    matrix[nbin, nsite * ngroup] Za = Z * a
    for(i in 1:n){
      newpattern[i] = poisson_rng(exp(Za[bin[i], index_a[i]))
    }
  }
}

Avoiding the transformed parameters could save time by not having to save the intermediate mu variable for every iteration.

Doing the intermediate calculation for Za instead of mu is cutting down on the number of calculations. I think you won’t be penalised by the indexation in Za[bin, index_a].

I also think that narrower priors might help. The priors seem very flat now. With a = 100, an increase of 1 in Z implies a multiplicative effect of exp(100) on new pattern.

Great, thank you so much for the tips!

It is not working yet (and sadly not giving me a reason why “Error in stanc(file = file, model_code = model_code, model_name = model_name, : c++ exception (unknown reason)”). I’m changing a few things around to figure out where the issue is coming from. Hopefully I can find the source of the error.

Building Rstan from source might help to get more informative error messages.

I already discovered some mistakes in my suggestion. I will report back when I solved them all. Sorry about that.

The model below passes the stanc tests. The biggest problem was that I mixed arrarys and vectors. That is why I need the loop in transformed data and the intermediate variable Za_array in the model block. I also forgot some closing ;'s in the generated quantity block.

I hope this works and speeds things up. You might have to check whether the Za_array keeps the right order for the observations. I think it should but I might be wrong, I am still not 100% confident in the indexes.

data {
int<lower=0> nsite;
int<lower=0> ngroup;
int<lower=0> n;
int<lower=0> nbin;
int<lower=0> nharm;
int<lower=0> site[n];
int<lower=0> group[n];
int<lower=0> bin[n];
matrix[nbin,nharm] Z;
int<lower=0> pattern[n];
}
transformed data{
  int<lower=0, upper=nsite * ngroup> index_a[n];
  for (i in 1:n){
    index_a[i] = (site[i] - 1) * ngroup + group[i];
  }

}
parameters{
  matrix[nharm, nsite * ngroup] a;
}
model{
  matrix[nbin, nsite * ngroup] Za = Z * a;
  to_vector(a) ~ normal(0, 100);
  { 
    real Za_array[n] = to_array_1d(Za[bin, index_a]);
    pattern ~ poisson(exp(Za_array));
  }
}
generated quantities{
  int <lower=0> newpattern[n];
  {
    matrix[nbin, nsite * ngroup] Za = Z * a;
    for(i in 1:n){
      newpattern[i] = poisson_rng(exp(Za[bin[i], index_a[i]]));
    }
  }
}

Awesome. I ended up having the change the Za_array because to_array_1d(Za[bin, index_a]) would lead to a vector of size length(bin)*length(index_a). So I added a little loop… which might slow things up a bit. But it is still much faster than before!

data {
int<lower=0> nsite;
int<lower=0> ngroup;
int<lower=0> n;
int<lower=0> nbin;
int<lower=0> nharm;
int<lower=0> site[n];
int<lower=0> group[n];
int<lower=0> bin[n];
matrix[nbin,nharm] Z;
int<lower=0> pattern[n];
}

transformed data{
int<lower=0, upper=nsitengroup> index_a[n];
for (i in 1:n){
index_a[i] = (site[i] - 1)ngroup + group[i];
}
}
parameters{
matrix[nharm, nsite
ngroup] a;
}
model{
matrix[nbin, nsite * ngroup] Za = Z
a;
to_vector(a) ~ normal(0, 100);
{
real Za_array[n];
for(i in 1:n)
{
Za_array[i]= Za[bin[i], index_a[i]];
}
pattern ~ poisson(exp(Za_array));
}
}
generated quantities{
int <lower=0> newpattern[n];
{
matrix[nbin, nsite * ngroup] Za = Z * a;
for(i in 1:n){
newpattern[i] = poisson_rng(exp(Za[bin[i], index_a[i]]));
}
}
}

Do you have the latest RStan? It should be giving you a reason. (I think there may be some residual issue on some Mac installations, which is perhaps why @stijn was suggesting recompiling R!)