Through tutorials and searching on the Internet, I’ve seen models include the ODEs being solved in the ‘transformed parameters,’ ‘model,’ and ‘generated quantities’ blocks. I am curious to know which of these is best, and which conditions would favour one over the others. Thank you!
Currently, I am working with linear systems of ODEs, and data that has dimensions 2e4 x 30, but because I am new to Stan, I would like to know about recommendations for other use cases as well.
Here is a little algorithm to determine the answer to your question:
1. Does the ODE function or the initial values depend on parameters? If not, you should do the solve in `transformed data`, or outside Stan and give the solution to Stan as data. That way you won’t waste computation doing it on each MCMC iteration. If yes, go to 2.
2. Does the log probability of your model depend on the ODE solution? If not, you can do the solve in `generated quantities`. That way you won’t waste computation doing it at each point of an HMC trajectory (i.e. several times per MCMC iteration). If yes, go to 3.
3. If you ended up here, then you need to do the solve in either `transformed parameters` or `model`. One difference is that if you do it in `transformed parameters`, the ODE solutions for each posterior draw at each output point will be accessible like any other parameters after sampling.
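As a rough illustration of case 3, here is a minimal sketch of a linear system where the likelihood depends on the solution, so the solve sits in `transformed parameters`. Everything named here (`linear_rhs`, `A`, `sigma`, `y_obs`, the priors, the normal noise model) is made up for the example and not taken from your model, and it assumes a Stan version recent enough to have `ode_rk45` and the `array[...]` syntax.

```stan
functions {
  // Hypothetical right-hand side of a linear system dy/dt = A * y.
  vector linear_rhs(real t, vector y, matrix A) {
    return A * y;
  }
}
data {
  int<lower=1> T;              // number of observation times
  int<lower=1> D;              // number of state variables
  real t0;                     // initial time, must be earlier than ts[1]
  array[T] real ts;            // observation times, strictly increasing
  array[T] vector[D] y_obs;    // noisy observations of the state
  vector[D] y0;                // initial state, assumed known here
}
parameters {
  matrix[D, D] A;              // unknown coefficient matrix
  real<lower=0> sigma;         // observation noise scale
}
transformed parameters {
  // The likelihood needs the solution, so it is solved at every
  // log-density evaluation. Declared in this block, y_hat is also
  // saved in the output for every posterior draw.
  array[T] vector[D] y_hat = ode_rk45(linear_rhs, y0, t0, ts, A);
}
model {
  // Placeholder priors, chosen only to make the sketch complete.
  to_vector(A) ~ std_normal();
  sigma ~ std_normal();
  for (t in 1:T) {
    y_obs[t] ~ normal(y_hat[t], sigma);
  }
}
```

Moving the `y_hat` declaration to the top of the `model` block would give the same posterior, but `y_hat` would then be a local variable and would not appear in the output. If `A` were known and passed as data instead, the same `ode_rk45` call could go in `transformed data` and run only once (case 1).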
In `generated quantities` you can also do additional parameter-dependent ODE solves that you want for each parameter draw but that are not needed to compute the log probability of the model. This is useful if, for example, you want ODE solutions on a denser time grid than your data. The solutions computed in `generated quantities` can also be accessed after sampling.
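A sketch of that pattern, under the same kind of assumptions (a hypothetical linear system `linear_rhs`, placeholder priors, normal noise, a recent Stan version, and an illustrative dense grid `ts_dense` supplied as data): the solve needed for the likelihood is a local variable in `model`, and a second solve on the denser grid is done in `generated quantities`.

```stan
functions {
  // Hypothetical right-hand side of a linear system dy/dt = A * y.
  vector linear_rhs(real t, vector y, matrix A) {
    return A * y;
  }
}
data {
  int<lower=1> T;                  // number of observation times
  int<lower=1> D;                  // number of state variables
  real t0;                         // initial time, earlier than all output times
  array[T] real ts;                // observation times, strictly increasing
  array[T] vector[D] y_obs;        // noisy observations of the state
  vector[D] y0;                    // initial state, assumed known here
  int<lower=1> T_dense;            // size of the denser output grid
  array[T_dense] real ts_dense;    // denser grid, strictly increasing
}
parameters {
  matrix[D, D] A;                  // unknown coefficient matrix
  real<lower=0> sigma;             // observation noise scale
}
model {
  // Local variable: used for the likelihood, not stored in the output.
  array[T] vector[D] y_hat = ode_rk45(linear_rhs, y0, t0, ts, A);
  // Placeholder priors, chosen only to make the sketch complete.
  to_vector(A) ~ std_normal();
  sigma ~ std_normal();
  for (t in 1:T) {
    y_obs[t] ~ normal(y_hat[t], sigma);
  }
}
generated quantities {
  // Runs once per saved draw and does not enter the log probability.
  array[T_dense] vector[D] y_dense
      = ode_rk45(linear_rhs, y0, t0, ts_dense, A);
}
```

After sampling, `y_dense` is available per draw like any other output variable, while `y_hat` is not stored.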
One thing that I am not sure about (and someone else can answer) is the possible difference in memory footprint between doing the solve in `transformed parameters` vs. `model`, especially in this case where the system is very large.