After fitting my model on n ≈ 3000 rows, I compute predicted values for roughly 250 permutations of covariates (counterfactual scenarios). With 4 chains of 3000 iterations each, the generated quantities matrix for the fitted values grows to 3000 × 250 × 12000 = 9 billion entries. I have observed that exporting high-dimensional generated quantities substantially lengthens the time between the chains finishing and the rstan `sampling()` call returning.

I only really need the posterior average residual for each permutation, so a 250-element vector — for each scenario, the mean over draws of the row-wise mean residual — would be sufficient to export, rather than the 9-billion-entry matrix.

What would be the recommended workflow when one only needs point estimates of generated quantities rather than the full per-draw posterior? Should I wrap the full matrix in a local block (`{ … }`) so it is excluded from export, and compute the averages inside the generated quantities section? Or should I "manually" compute predictions from the posterior mean parameters outside of Stan? Also, does the generated quantities calculation run only once after all iterations of a chain are finished, or after each iteration?
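For concreteness, here is a rough sketch of the first option as I imagine it — the names (`y_hat`, `X_scenario`, `beta`, `K`, `N`) are placeholders for my actual model, not tested code:

```stan
generated quantities {
  vector[K] mean_resid;  // K = 250 scenarios; only this vector would be stored
  {
    // local block: y_hat exists only during computation and is never exported
    for (k in 1:K) {
      vector[N] y_hat = X_scenario[k] * beta;  // placeholder linear predictor
      mean_resid[k] = mean(y - y_hat);         // average residual for scenario k
    }
  }
}
```

As I understand it, this still computes the full fitted-value matrix at every draw, but only the 250-element summary per draw ends up in the output.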

One key gap in my understanding, which would inform the decision, is whether the long additional run time with high-dimensional generated quantities exports comes from their calculation in the generated quantities block or from the export of the fitted Stan object back to R.
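A related alternative I am aware of (again only a sketch — `y_hat` is a placeholder, though `pars` and `include` are actual arguments of `rstan::sampling()`) would be to keep the full matrix in the model but drop it from the returned object:

```r
# Exclude the large generated quantity from the returned stanfit object
fit <- rstan::sampling(
  model,
  data = stan_data,
  chains = 4, iter = 3000,
  pars = c("y_hat"),
  include = FALSE  # store all quantities except y_hat
)
```

If the bottleneck is the export back to R rather than the calculation itself, I would expect this to help; if it is the calculation, presumably only restructuring the generated quantities block would.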

I would be very grateful for any advice and am happy to supply a reproducible example, although I believe the question may be general enough not to require one.

Thank you!