Model vectorization

Hi there, I have a hierarchy of Beta distribution and my model is sooo slow. I am looking for a way to improve the run-time. By removing all the fluffs, I have the following model (assume that L is a large value and in the model I need to access l_sens[a, b] where 1<= a <= L and 1 <= b <= n_sens_buckets ):

parameters {
    vector<lower=0, upper=1>[n_sens_buckets] sensitivity;
    vector<lower=0, upper=1>[n_sens_buckets] l_sens[L];
}

model {
    sensitivity ~ beta(9.0, 1.0);

    for (i in 1:L) {
        l_sens[i] ~ beta(10 * sensitivity, 10 * (1 - sensitivity));
    }

I tried to vectorize the model section as follow to replicate the above for loop:

model {
     sensitivity ~ beta(9.0, 1.0);
     l_sens ~ beta(rep_array(sensitivity * 10, L), rep_array((1.0 - sensitivity) * 10, L));
}

However, I ran into the following error:

ValueError: SYNTAX ERROR, MESSAGE(S) FROM PARSER:

No matches for: 

  vector[] ~ beta(vector[], vector[])

That would be great if someone could help me with that. Thanks

I am just a novice and I hope I don’t cause more harm than good here.

This toy model works for me using rstan, so I do not think you have to make a vector out of your sensitivity value.

set.seed(1)
DAT <- rbeta(500, shape1 = 9.3, shape2 = 10*(1-0.93))

MODEL = "
data {
 int N;
 vector<lower = 0, upper = 1>[N] DAT;
}
parameters {
 real<lower = 0> sensitivity;
}
model {
 sensitivity ~ beta(9.0, 1.0);
 DAT ~ beta(10 * sensitivity, 10* (1-sensitivity));
}
"
datList <- list(DAT = DAT, N = 500)
FIT <- stan(model_code = MODEL, data = datList)

Maybe it is due to my lack of experience but I am surprised you have a parameter on the left side of the sampling statement

l_sens ~ beta(rep_array(sensitivity * 10, L), rep_array((1.0 - sensitivity) * 10, L));

It seems to me that leaves the model with no data against which to fit. Am I completely misunderstanding your code?

Thanks @FJCC for looking into that. remember that it is just a small part of of model and not the full one and observations will play a role somewhere else but I just added part of the code I need vectorization because it made my model so slow.
Please see that l_sens has the shape of L * n_sens_buckets and L is a big value. It means that for each 1 <= i <= L we need to sample from Beta which is also a vector of size n_sens_buckets.

The loop might be a bit faster if you reorder it so that sensitivity is scalar

for (i in 1:n_sens_buckets) {
    l_sens[:,i] ~ beta(10 * sensitivity[i], 10 * (1 - sensitivity[i]));
}

but loops are already fast so I don’t think this part is what’s making your model so slow. Could you post the full model?

1 Like

Note, this is common in hierarchical models (indeed, arguably the defining feature).

Thanks! I was confused because I thought I was looking at the entire model { } code. I did a poor job of expressing myself but I looking for data somewhere in the model.

1 Like