Output state variables without repeating code in generated quantities

As I understand it, to examine model predictions in Stan output that are not estimated parameters defined in the parameters{} block, the variables have to be calculated in the generated quantities{} block. The issue is that some of the code used in the model{} block then has to be repeated in the generated quantities{} block. This is OK for simple calculations but tedious for more complex ones. Variables calculated in the model{} block do not appear to be available in the generated quantities{} block (i.e., they are local).

Is there a way to make variables in model{} global so they are available in generated quantities{}?
In ADMB, one can declare a variable type (e.g., sdreport_vector) that tells the program to include the variable in the standard output (e.g., in the correlation matrix) after the model has converged, thereby providing the same computational efficiency as generated quantities{}. As there are some similarities between Stan and ADMB (e.g., the code blocks, the C++ code generator), I’m wondering if this type of feature has been implemented in Stan.

If not, the only workaround I can think of is to move as much of the code in the model{} block as possible into a function, so the function can be called from both the model{} and generated quantities{} blocks. Is this your recommendation?

Thanks for your assistance!

If you construct the variables in the transformed parameters block, then they will be available in both the model and generated quantities blocks, and they will also be written to the output.
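A minimal sketch of this, assuming a hypothetical linear regression (the names N, t, y, alpha, beta, sigma, mu, and y_rep are illustrative and not from the original model):

```stan
data {
  int<lower=1> N;
  vector[N] t;
  vector[N] y;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
transformed parameters {
  // Predicted state: computed once here, visible in the blocks below,
  // and saved in the output alongside the parameters
  vector[N] mu = alpha + beta * t;
}
model {
  // Only priors and the likelihood remain in the model block
  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  sigma ~ normal(0, 5);
  y ~ normal(mu, sigma);
}
generated quantities {
  // mu is reused here without repeating its calculation
  vector[N] y_rep;
  for (n in 1:N) {
    y_rep[n] = normal_rng(mu[n], sigma);
  }
}
```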

Thank you so much! I was taking the name of the ‘transformed parameters’ code block too literally. I now understand that it can include transformed parameters as well as predicted states that depend on raw or transformed parameters and on input data. Thus the model code block becomes very short and is essentially just priors, hyper-priors, and likelihoods.

The only minor issue in my case is that the transformed parameters{} block will contain some temporary variables (a largish matrix) that I don’t care about, so I don’t want to incur the posterior file size or processing time cost of writing them out. Does the order of operations between the transformed parameters{} and model{} blocks matter? If possible, I would like to compute these temporary variables, which are needed as input to some of the calculations in transformed parameters{}, in the model{} block. Not a big deal in my case if this can’t be done.

Cheers,

Josh

To use “temporary” variables in the transformed parameters block (or in any block for that matter), you can just wrap that section of the parameter construction in {} to create a new local scope:

transformed parameters {
  real global_var;
  {
    // Variables declared in this scope are local, and not saved
    real local_var = 5;

    // Any modification/construction of global variables still persists
    global_var = local_var + 1;
  }
}

Great. You folks have thought of everything! This issue could be significant if the temporary variables require large matrices or arrays.

Thanks so much.

Josh

Just a small comment: the transformed parameters and model blocks are evaluated every time the target gradient is evaluated, which in general will be many times per Markov transition, but the generated quantities block is called only once per Markov transition. Moreover, the generated quantities block is evaluated without the overhead of automatic differentiation and so tends to be cheaper than equivalent calls in the transformed parameters and model blocks.

All of this means that reevaluating code in the generated quantities block is rarely a significant computational burden. Duplicated code is always a potential source of bugs, but the computational cost shouldn’t be much of a consideration.
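If you do recompute in generated quantities, the function approach raised in the original question keeps the calculation in one place. A rough sketch, assuming a hypothetical exponential-growth model (predict_state and all variable names here are made up for illustration):

```stan
functions {
  // Hypothetical helper: predicted state at each time point
  vector predict_state(real log_n0, real r, vector t) {
    return exp(log_n0 + r * t);
  }
}
data {
  int<lower=1> N;
  vector[N] t;
  vector<lower=0>[N] y;
}
parameters {
  real log_n0;
  real r;
  real<lower=0> sigma;
}
model {
  // Local to the model block: evaluated with autodiff, not saved
  vector[N] mu = predict_state(log_n0, r, t);
  log_n0 ~ normal(0, 5);
  r ~ normal(0, 1);
  sigma ~ normal(0, 1);
  y ~ lognormal(log(mu), sigma);
}
generated quantities {
  // Same function call, evaluated once per draw without autodiff
  vector[N] mu_hat = predict_state(log_n0, r, t);
}
```

With this layout only mu_hat is written to the output, and the prediction logic lives in a single function rather than in two copies.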
