Generated quantities from model block (i.e. not storing transformed parameters or computing predictions twice)

twistedmersenne · May 31, 2018, 3:03pm

I need to save the model output for each iteration, but the intermediate calculations consist of large matrices, so I can’t have all that in the transformed parameters block without blowing up all the RAM many times over.
Computing them again from the primitive parameters in the generated quantities block would duplicate the long calculations and hurt performance.

The only other thing I could think of was writing a new function that uses the intermediate functions, and outputs only the quantities I need in the transformed parameters block, but that could make the stan model quite unreadable.
EDIT: there’s an additional problem with this last approach, because some “transformed parameters” are computed once per iteration and used to produce multiple outputs, so this would also increase the number of calculations.

Is there a simple way of pulling quantities in the model block for the generated quantities besides those?
Alternatively, is it possible to store only some of the transformed parameters?

aaronjg · May 31, 2018, 6:23pm

What Stan interface are you using? Why would storing them in transformed parameters blow up RAM, but storing them in generated quantities not?

As an aside, generated quantities are only computed at the final point, rather than along the whole trajectory, so repeating calculations here may not be as big a performance hit as you think.

twistedmersenne · May 31, 2018, 8:37pm

I’m using PyStan. I couldn’t use generated quantities, because it apparently cannot use data from the model block.
I need only the last quantity calculated, e.g. f_i(x) = L \cdot \tilde{f_i}, where \tilde{f_i} are parameters, L = cholesky_decompose(K), and K needs to be computed from the hyperparameters. The result f then goes into y ~ poisson(f).

aaronjg · May 31, 2018, 9:34pm

It looks like PyStan has a “pars” argument, where you can specify which parameters to keep.
https://pystan.readthedocs.io/en/latest/api.html

You can then move the intermediate quantity L into the transformed parameter block, and then not save it.
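
A minimal sketch of this suggestion, assuming PyStan 2.x (where `StanModel.sampling` accepts a `pars` argument). The model is made up for illustration: the names `x`, `rho`, `alpha`, `f_tilde`, the squared-exponential kernel, the priors, the jitter, and the use of `poisson_log` instead of `poisson` are all assumptions, not the original poster's model.

```python
# Hypothetical Stan program, written in the older array syntax that PyStan 2.x uses.
STAN_CODE = """
data {
  int<lower=1> N;
  real x[N];
  int<lower=0> y[N];
}
parameters {
  real<lower=0> rho;
  real<lower=0> alpha;
  vector[N] f_tilde;
}
transformed parameters {
  matrix[N, N] L;   // large intermediate, computed once per evaluation
  vector[N] f;      // the only quantity we want to keep
  {
    matrix[N, N] K = cov_exp_quad(x, alpha, rho);
    for (n in 1:N)
      K[n, n] = K[n, n] + 1e-9;  // small jitter for numerical stability
    L = cholesky_decompose(K);
  }
  f = L * f_tilde;
}
model {
  rho ~ inv_gamma(5, 5);
  alpha ~ normal(0, 1);
  f_tilde ~ normal(0, 1);
  y ~ poisson_log(f);  // assumption: log link so the rate stays positive
}
"""

# Only these quantities are requested from the sampler; L is computed but not listed.
KEEP_PARS = ["f"]


def fit_keeping_only_f(data, n_iter=2000, n_chains=4):
    """Compile the model and sample, requesting only the parameters in KEEP_PARS."""
    import pystan  # imported lazily; assumes PyStan 2.x is installed

    model = pystan.StanModel(model_code=STAN_CODE)
    return model.sampling(data=data, pars=KEEP_PARS, iter=n_iter, chains=n_chains)
```

With a data dictionary such as `{"N": N, "x": x, "y": y}`, calling `fit_keeping_only_f(data)` and then `fit.extract()["f"]` should give the draws of `f` without the N×N matrix `L` being stored for every iteration.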

twistedmersenne · June 1, 2018, 10:04pm

That’ll probably do the trick, thanks. I couldn’t verify it all the way through because the Stan program broke for some other reason, but passing the pars argument with a list of strings to the sampling command makes the computer use ~1GB instead of ~30GB, so I assume it’s working.

betanalpha · June 12, 2018, 8:36am

Unless your trajectories are only a few steps each, recomputing everything in the generated quantities block will be a negligible cost. The generated quantities are computed with doubles and not vars, and hence are faster than even a single gradient evaluation. Additionally, you can make use of user-defined functions to avoid code replication in the recomputations.
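
One way the user-defined-function idea could look, again as a hedged sketch for PyStan 2.x with the older Stan syntax. The function name `latent_f`, the kernel, the priors, and `poisson_log` are illustrative assumptions. The model block keeps the large matrices as local variables inside the function, and the generated quantities block calls the same function once per saved draw.

```python
# Hypothetical Stan program: one function shared by the model and
# generated quantities blocks, so the calculation is written only once.
STAN_CODE_SHARED = """
functions {
  vector latent_f(real[] x, real alpha, real rho, vector f_tilde) {
    int N = size(x);
    matrix[N, N] K = cov_exp_quad(x, alpha, rho);
    for (n in 1:N)
      K[n, n] = K[n, n] + 1e-9;  // small jitter for numerical stability
    return cholesky_decompose(K) * f_tilde;
  }
}
data {
  int<lower=1> N;
  real x[N];
  int<lower=0> y[N];
}
parameters {
  real<lower=0> rho;
  real<lower=0> alpha;
  vector[N] f_tilde;
}
model {
  // local variable: used for the likelihood, never saved
  vector[N] f_model = latent_f(x, alpha, rho, f_tilde);
  rho ~ inv_gamma(5, 5);
  alpha ~ normal(0, 1);
  f_tilde ~ normal(0, 1);
  y ~ poisson_log(f_model);
}
generated quantities {
  // recomputed with plain doubles, once per saved draw
  vector[N] f = latent_f(x, alpha, rho, f_tilde);
}
"""


def compile_shared_model():
    """Compile the sketch above; assumes PyStan 2.x is installed."""
    import pystan  # imported lazily so the string can be inspected without it

    return pystan.StanModel(model_code=STAN_CODE_SHARED)
```

In this layout the primitive parameters and `f` are saved, while `K` and its Cholesky factor are not, because they exist only as local variables.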
