Hi Everyone,

I am trying to convert all the for loops into vectorized form so that the calculations are done faster.

But, when I try to change the for loops to vectorized form in generated quantities section, Im getting a dimension mismatch in assignment error.

My question is, vectorization is not applicable to generated quantities section?

Thanks in advance,

2 Likes

Could you please post some code so we can take a look at how youâ€™re currently attempting to do this?

1 Like

Yes, it is possible to vectorize rng functions, but as the error tells you,

you need to ensure you have matching dimensions on both sides of the rng function.

For example, given the data

```
data {
int N;
vector[N] y;
}
```

You can create the following vector of ones which you can then use on the `generated quantities`

block to ensure your functions have the correct dimensions.

```
transformed data {
vector[N] ones_N = rep_vector(1, N);
}
parameters {
real <lower = 0> lambda;
}
model {
lambda ~ exponential(1);
target += exponential_lpdf(y | lambda);
}
generated quantities {
real y_rep[N] = exponential_rng(ones_N * lambda);
}
```

Does, avoiding looping over every data point.

Hope it helps!

Hi@tiagocc

I will try to implement this and update tomorrow If Iâ€™m able to make my model work. Thanks for the tip

Hi @tiagocc,

Can you show an example of using the ones for normal_rng function?

1 Like

Sorry it took so long to respond.

I would do something like:

```
transformed data {
int N = 10;
vector[N] ones_N = rep_vector(1, N);
}
generated quantities {
real yrep [N] = normal_rng(ones_N * 1, ones_N * 10);
}
```

1 Like

a little clarification on the above example - which is entirely correct -

as this issue just bubbled up again - https://github.com/stan-dev/docs/issues/208

intuitively, one would want `yrep`

to be of type `vector[N]`

just as `y`

is.

however, this doesnâ€™t compile.

the definition of `normal_rng`

in the Stan Functions Reference Manual - https://mc-stan.org/docs/2_23/functions-reference/normal-distribution.html

says:

`R`

`normal_rng`

`(reals mu, reals sigma)`

Generate a normal variate with location mu and scale sigma; may only be used in transformed data and generated quantities blocks. For a description of argument and return types, see section vectorized PRNG functions.

that definition uses type `R`

- in the prng-vectorization section, 10.8.3.3 Return type - it says:

The result of a vectorized PRNG function depends on the size of the arguments and the distributionâ€™s support. If all arguments are scalars, then the return type is a scalar. For a continuous distribution, if there are any non-scalar arguments, the return type is a real array ( `real[]`

) matching the size of any of the non-scalar arguments, as all non-scalar arguments must have matching size. Discrete distributions return `ints`

and continuous distributions return `reals`

, each of appropriate size. The symbol `R`

denotes such a return type.

in short, the vectorized RNG function `normal_rng`

return scalars or `int`

or `real`

arrays. not `matrix`

, `vector`

, or `row_vector`

.

1 Like

That only helps if thereâ€™s autodiff inside the loop that is more efficient outside the loop. Loops themselves are very fast in Stan. That means this

```
real y[N];
for (n in 1:N)
y[n] ~ normal(mu, sigma);
```

is a lot slower than the vectorized form

```
y ~ normal(mu, sigma);
```

tghat doesntâ€™ carry over everywhere. For example,

```
real x[N];
real y[N] = exp(x);
```

isnâ€™t any faster than

```
real x[N];
real y[N];
for (n in 1:N)
y[n] = exp(x[n]);
```

because all the vectorized `exp`

function does is a loop down in C++ just like the loop code.

*Edit: Thereâ€™s no autodiff in transformed parameters or generated quantities, so there wonâ€™t be a speed difference between loops or vectorized operations for those functions. There still will be a big difference between matrix operations and loops. You definitely want to call matrix operations where possible because they have lots of optimizations built in.*