Applying Vectorization in generated quantities section

Murali_037 · March 20, 2020, 5:18pm

Hi Everyone,

I am trying to convert all the for loops into vectorized form so that the calculations are done faster.

But, when I try to change the for loops to vectorized form in generated quantities section, Im getting a dimension mismatch in assignment error.

My question is, vectorization is not applicable to generated quantities section?

Thanks in advance,

mike-lawrence · March 20, 2020, 7:56pm

Could you please post some code so we can take a look at how you’re currently attempting to do this?

tiagocc · March 20, 2020, 8:48pm

Yes, it is possible to vectorize rng functions, but as the error tells you,

you need to ensure you have matching dimensions on both sides of the rng function.

For example, given the data

data {
int N;
vector[N] y;
}

You can create the following vector of ones which you can then use on the generated quantities block to ensure your functions have the correct dimensions.

transformed data {
vector[N] ones_N = rep_vector(1, N);
}
parameters {
real <lower = 0> lambda;
}

model {
lambda ~ exponential(1);
target += exponential_lpdf(y | lambda);
}

generated quantities {
real y_rep[N]  = exponential_rng(ones_N * lambda);
}

Does, avoiding looping over every data point.

Hope it helps!

Murali_037 · March 20, 2020, 11:25pm

Hi@tiagocc

I will try to implement this and update tomorrow If I’m able to make my model work. Thanks for the tip

Murali_037 · March 24, 2020, 2:58pm

Hi @tiagocc,

Can you show an example of using the ones for normal_rng function?

tiagocc · March 31, 2020, 5:01pm

Sorry it took so long to respond.

I would do something like:


transformed data {
  int N = 10;
  vector[N] ones_N = rep_vector(1, N);
}

generated quantities {
  real yrep [N] = normal_rng(ones_N * 1, ones_N * 10);  
}

mitzimorris · July 1, 2020, 4:49pm

a little clarification on the above example - which is entirely correct -
as this issue just bubbled up again - Minor typos in Stan user guide · Issue #208 · stan-dev/docs · GitHub

intuitively, one would want yrep to be of type vector[N] just as y is.
however, this doesn’t compile.

the definition of normal_rng in the Stan Functions Reference Manual - Stan Functions Reference
says:

R normal_rng (reals mu, reals sigma)
Generate a normal variate with location mu and scale sigma; may only be used in transformed data and generated quantities blocks. For a description of argument and return types, see section vectorized PRNG functions.

that definition uses type R - in the prng-vectorization section, 10.8.3.3 Return type - it says:

The result of a vectorized PRNG function depends on the size of the arguments and the distribution’s support. If all arguments are scalars, then the return type is a scalar. For a continuous distribution, if there are any non-scalar arguments, the return type is a real array ( real[] ) matching the size of any of the non-scalar arguments, as all non-scalar arguments must have matching size. Discrete distributions return ints and continuous distributions return reals , each of appropriate size. The symbol R denotes such a return type.

in short, the vectorized RNG function normal_rng return scalars or int or real arrays. not matrix, vector, or row_vector.

Bob_Carpenter · July 7, 2020, 7:58pm

That only helps if there’s autodiff inside the loop that is more efficient outside the loop. Loops themselves are very fast in Stan. That means this

real y[N];
for (n in 1:N)
  y[n] ~ normal(mu, sigma);

is a lot slower than the vectorized form

y ~ normal(mu, sigma);

tghat doesnt’ carry over everywhere. For example,

real x[N];
real y[N] = exp(x);

isn’t any faster than

real x[N];
real y[N];
for (n in 1:N) 
  y[n] = exp(x[n]);

because all the vectorized exp function does is a loop down in C++ just like the loop code.

Edit: There’s no autodiff in transformed parameters or generated quantities, so there won’t be a speed difference between loops or vectorized operations for those functions. There still will be a big difference between matrix operations and loops. You definitely want to call matrix operations where possible because they have lots of optimizations built in.

Topic		Replies	Views
Vectorization of the normal_rng and bernouli_rng Modeling	7	681	August 6, 2024
Inspecting data fit of stan model/ problems with generated quantities code block Modeling rstan	4	497	October 27, 2022
Problem Simulating Data with Generated Quantities (Dimension mismatch in assignment; type = real; right-hand side type = real[ ]) RStan rstan , techniques , specification	3	857	April 29, 2021
Bernoulli_logit_rng not vectorized in RStan 2.18.2? Modeling	2	411	July 29, 2019
Problem in generated quantities Modeling specification	2	702	November 15, 2019

Applying Vectorization in generated quantities section

Related topics