I am trying to fit a sum over several functions with Stan and was wondering if the vectorized approach would be more efficient.
In practice this would come down to the following models:
sum_observables ~ normal(function_1(...) + function_2(...) + function_3(...), sigma_sum);
Here sum_observables is simply the sum over observable_1, observable_2, etc., and sigma_sum is the corresponding covariance-like standard deviation combining the individual sigmas.
versus fitting each observable separately:
observable_1 ~ normal(function_1(...), sigma_observable_1);
observable_2 ~ normal(function_2(...), sigma_observable_2);
observable_3 ~ normal(function_3(...), sigma_observable_3);
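To make the comparison concrete, here is a minimal Python/SciPy sketch of the two likelihoods. Everything in it is a made-up placeholder: the linear/sinusoidal stand-ins for function_1..function_3, the parameter values, and the sigmas are not from the question, and a diagonal covariance is assumed so the aggregated noise reduces to sqrt(sig1^2 + sig2^2 + sig3^2).

```python
import numpy as np
from scipy.stats import norm

# Hypothetical stand-ins for function_1..function_3 and the data;
# the real functions and observables come from your model.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)

def function_1(a): return a * x
def function_2(b): return b * np.sin(x)
def function_3(c): return c * x**2

a, b, c = 1.0, 2.0, 0.5           # "true" parameters for the simulation
sig1, sig2, sig3 = 0.1, 0.2, 0.1
observable_1 = function_1(a) + rng.normal(0.0, sig1, x.size)
observable_2 = function_2(b) + rng.normal(0.0, sig2, x.size)
observable_3 = function_3(c) + rng.normal(0.0, sig3, x.size)

# Aggregated model: sum of observables against the sum of functions.
# With independent noise, the sum of normals is again normal, with
# sd sqrt(sig1^2 + sig2^2 + sig3^2).
sum_observables = observable_1 + observable_2 + observable_3
sigma_sum = np.sqrt(sig1**2 + sig2**2 + sig3**2)
loglik_aggregated = norm.logpdf(
    sum_observables, function_1(a) + function_2(b) + function_3(c), sigma_sum
).sum()

# Separate model: one likelihood term per observable.
loglik_separate = (norm.logpdf(observable_1, function_1(a), sig1).sum()
                   + norm.logpdf(observable_2, function_2(b), sig2).sum()
                   + norm.logpdf(observable_3, function_3(c), sig3).sum())

print(loglik_aggregated, loglik_separate)
```

The two quantities are genuinely different objective functions, not two encodings of the same one, which is the point the answer below makes.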
Although the functions you are computing are the same and the likelihoods look similar, those are two very different models.
If you have access to the separate observables, aggregating that data will usually provide less information about the model parameters, because the sum only constrains a linear combination of the functions, which may make them (and their parameters) non-identifiable.
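That non-identifiability can be seen in a toy example. The two functions below are deliberately (and hypothetically) chosen with the same shape in x, so their sum depends only on a + b; whether your real functions overlap like this depends on your model.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 30)
sig = 0.1

def f1(a):
    return a * x

def f2(b):
    return b * x

observable_1 = f1(1.0) + rng.normal(0.0, sig, x.size)
observable_2 = f2(2.0) + rng.normal(0.0, sig, x.size)
sum_observables = observable_1 + observable_2
sigma_sum = np.sqrt(sig**2 + sig**2)

def loglik_aggregated(a, b):
    # likelihood of the summed data under the summed functions
    return norm.logpdf(sum_observables, f1(a) + f2(b), sigma_sum).sum()

def loglik_separate(a, b):
    # one likelihood term per observable
    return (norm.logpdf(observable_1, f1(a), sig).sum()
            + norm.logpdf(observable_2, f2(b), sig).sum())

# (a=1, b=2) and (a=2, b=1) give the same summed mean 3*x, so the
# aggregated model cannot tell them apart ...
print(np.isclose(loglik_aggregated(1.0, 2.0), loglik_aggregated(2.0, 1.0)))  # True
# ... while the separate-observables model clearly prefers the truth
print(loglik_separate(1.0, 2.0) > loglik_separate(2.0, 1.0))  # True
```

Under the aggregated model the posterior would have a flat ridge along a + b = const, exactly the kind of geometry that gives samplers trouble.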
On the other hand, if your covariance matrix is not diagonal, in your formulation you wouldn’t be able to estimate the covariances (though you could still have covariance matrices with shared parameters across the different observables).
If your specific model formulation does prevent identifiability issues (I don’t see exactly how it would, but it may be possible), then it comes down to efficiency. I’d guess there wouldn’t be much of a difference in that area; I’d just try the different versions if I really needed an improvement, and otherwise I wouldn’t bother.