Prediction in generated quantities vs. model parameters

mgrab · September 3, 2020, 8:08pm

Hi All,
This is more of a general question about predicting from new observations in Stan. I see from the User’s Guide (via the link below) that, in addition to using generated quantities to make predictions, it is also possible to do this in the model block, where the predictions become parameters. I was wondering what the pluses and minuses of each approach are. Is it just that using the generated quantities block to make predictions means there are n (the number of predictions) fewer parameters for Stan to estimate?
Thanks,
MG

caesoma · September 4, 2020, 4:15pm

The basic difference between the model and transformed parameters blocks is that the former does not store the calculations made from the model parameters, while the latter returns them alongside the traces of the original parameters. Otherwise, both can be used for the calculations needed for the HMC chain.

generated quantities is a bit different: it cannot (I believe) be used to compute the above; it only generates arbitrary output from the samples of the chain. This has a few advantages. You don’t have to do any intermediate calculations, only those on the actual samples from the chain. Because of that, you can compute a forecast or any model output that doesn’t match the exact size of the data (if you do that in the other blocks, you have to keep track of which slice of the prediction actually matches the data). Finally, you can use random number generators in this block, which is not possible in the others (so you can, for instance, generate a stochastic prediction or an ensemble of stochastic trajectories).
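As a rough illustration of that last point, here is a minimal sketch in Python (not Stan) of what a per-draw stochastic prediction amounts to for a linear model with normal noise. The function name `posterior_predictive`, the parameter names `a`, `b`, `sigma`, and the draws themselves are hypothetical toy values, not output from any real fit.

```python
import random


def posterior_predictive(draws, x_new, seed=0):
    """For each posterior draw (a, b, sigma), simulate one y per new x."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    predictions = []
    for a, b, sigma in draws:
        # One noisy prediction per new x, using only this draw's parameters.
        predictions.append([rng.gauss(a * x + b, sigma) for x in x_new])
    return predictions


# Hypothetical posterior draws of (a, b, sigma); toy values for illustration.
draws = [(2.0, 1.0, 0.5), (2.1, 0.9, 0.4), (1.9, 1.1, 0.6)]
x_new = [5.0, 6.0]  # new inputs; the length need not match the training data

y_new = posterior_predictive(draws, x_new)  # one row of predictions per draw
```

The random numbers are drawn after the fact, once per stored draw, so they play no part in the sampling itself, and the output can have any length.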

Whatever you choose, none of this affects the number of parameters Stan has to estimate (only the number and kind of intermediate calculations). Also, I’d be careful with what is meant by prediction, because it is sometimes used as a synonym for forecasting (or backcasting), and sometimes only for the output you get from the model given a set of parameters.

mgrab · September 5, 2020, 8:47pm

Hi, thanks for the help.

I’m not really forecasting or backcasting; I am predicting a new value through a linear model that was trained in Stan using a separate set of data.

My question really is (and if you answered it, excuse me, I must have missed it): looking at the Stan page on Prediction (see link), what are the benefits of making predictions in the model block (predictions modeled as parameters) rather than in the generated quantities block (predictions declared as generated quantities)?

Thanks,
MG

caesoma · September 7, 2020, 4:52am

Except for the random numbers that can be generated in the generated quantities block, it’s mostly a matter of convenience: if you have no use for the transformation of parameters after inference, it can go in the model block; if you need to store it, it may go in transformed parameters.

If the data is separate but on the same support as the data used for inference, you can reuse the same linear model computed in transformed parameters, e.g. y = ax + b where x = {0, 1, 2, 3, 4}. If x differs between data sets, it may be best to use generated quantities. But again, it’s a matter of organization and convenience for the most part.
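To make the y = ax + b example concrete, here is a small sketch in Python (not Stan) that evaluates the same linear predictor once on the inference inputs and once on a separate input set of a different length. The helper `linear_predictor`, the values of `a` and `b`, and `x_other` are assumptions chosen purely for illustration.

```python
def linear_predictor(a, b, xs):
    """Evaluate y = a * x + b at each input in xs."""
    return [a * x + b for x in xs]


a, b = 2.0, 1.0            # hypothetical single draw of the parameters
x_train = [0, 1, 2, 3, 4]  # inputs used for inference, as in the example above
x_other = [10, 20]         # hypothetical separate inputs with a different length

mu_train = linear_predictor(a, b, x_train)  # [1.0, 3.0, 5.0, 7.0, 9.0]
mu_other = linear_predictor(a, b, x_other)  # [21.0, 41.0]
```

The formula is identical in both calls; only the inputs and the output size change, which is why where the computation lives is largely an organizational choice.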

mgrab · September 15, 2020, 9:13am

Thank you very much, I think that clears up my questions.
Best,
MG
