I would like to speed up a model. My data are a ragged (non-rectangular) data frame, which I can easily represent as a list. The number of columns and rows is not fixed but is given as input.

Now, the way I am sampling

s1 ~ normal()
s2 ~ normal()
…
sN ~ normal()

is by building a linear array plus a map array, in which I map every element to a parameter (the parameter could be a vector of size N).

This is really slow because it is not vectorized. Is there a solution or trick to vectorize over columns (in my example)? My understanding is that lists/tuples haven't been implemented yet.

As far as I understand it, the most time-consuming part of running Stan is calculating the derivatives of the probability functions. The calculations are done by traversing an expression graph that represents the function and its derivative. When you have normal_1, normal_2, …, normal_n in your model, the expression graph needs to be constructed n times. When you have normal(vector_mu, vector_sd), the expression graph only needs to be constructed once.
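For the ragged data in the question, one way to get a single vectorized statement is to stack all columns into one long vector and use the map array as a multi-index. The following is only a sketch under assumptions: the names N, K, col, y, mu and sigma and the priors are placeholders I made up, not taken from the original model, and the array syntax assumes a recent Stan version (older versions write `int col[N]`).

```stan
data {
  int<lower=1> N;                       // total number of observations, all columns stacked
  int<lower=1> K;                       // number of columns
  array[N] int<lower=1, upper=K> col;   // map array: column that each observation belongs to
  vector[N] y;                          // observations in long (flattened) format
}
parameters {
  vector[K] mu;                         // one location per column (assumed structure)
  vector<lower=0>[K] sigma;             // one scale per column (assumed structure)
}
model {
  mu ~ normal(0, 5);                    // placeholder prior, not from the original model
  sigma ~ normal(0, 2);                 // placeholder prior, not from the original model
  // One vectorized statement instead of a loop over elements or columns:
  // mu[col] and sigma[col] are length-N vectors built by multi-indexing.
  y ~ normal(mu[col], sigma[col]);
}
```

Because the columns are stacked, they can have different lengths without needing a list or tuple type.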

Vectorized sampling statements help in two ways:

1. they share repeated computations from a scalar being broadcast, so if you have just a scalar sigma, you only need to compute log(sigma) once; and

2. they reduce the size of the expression graph, which gives an even bigger speedup on derivatives, which are the true bottleneck. This also has the pleasant property of converting a lot of virtual operations into statically locatable operations, which is also a big win.

We also drop any additive constants in the log density (they’re not needed for sampling).
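To make the shared computation and the dropped constant concrete, here is the summed normal log density for N observations with a scalar scale (the symbols y_n, mu_n and sigma are my notation, not from the original posts):

$$
\sum_{n=1}^{N} \log \mathrm{Normal}(y_n \mid \mu_n, \sigma)
  = -\frac{N}{2}\log(2\pi) \;-\; N \log \sigma \;-\; \frac{1}{2\sigma^2} \sum_{n=1}^{N} (y_n - \mu_n)^2
$$

The first term does not depend on any parameter, so it can be dropped for sampling. The second term needs log(sigma) only once when sigma is a scalar, whereas N separate scalar statements would each evaluate it.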

Vectorizing will give you a speedup even when both the location and scale vary. The speedup comes from reducing the number of virtual function calls during autodiff and from the reduced memory allocation. It's not as big a speedup as when the scale is a simple scalar (rather than a container), because the logarithm is the most compute-intensive part of the normal log density.