I have a massive `transformed parameters` block that I wish to move into `reduce_sum`. Based on this tutorial, it should be possible (quote: “For Stan programs this will usually mean that code in the transformed parameters and model block will be moved into the partial sum function”). However, when I declare my parameters I get the error `No transformations/constraints allowed`.

Also, from my understanding, parameters that are declared outside of the `parameters` block are not saved to the final output. Is there a way around that?

Thanks!

Hi,

I don’t think we can help you much without seeing the code that triggered the error. I would *guess* that the parser is complaining about declarations like:

```
real my_fun() {
real<lower=0> a;
...
}
```

where a constraint is used for a variable that is not at the top level of `parameters` / `transformed parameters` / `generated quantities`. This is because Stan currently cannot (and does not) enforce constraints for variables defined elsewhere. The quick solution is to remove the constraints; you then obviously have a bit more responsibility to make sure your code always generates values that satisfy the constraints.
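For instance, a constraint-free version of the sketch above (same placeholder names as in the earlier snippet) would be:

```
real my_fun() {
  real a;  // no <lower=0> here: constraints are only allowed at the top
           // level of parameters / transformed parameters / generated quantities
  ...
}
```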

Best of luck with your model.

Yes, this is what I tried to do. Thanks so much! I came to the solution of recalculating the parameters in the `generated quantities` block, which saves them to the final output. Just to be sure: there is no difference between declaring parameters in the `parameters` block and in the `model` block (except for the constraints issue), right?

Yes, that would be the recommended approach. Note that `generated quantities` is evaluated once per iteration, while `model` / `transformed parameters` are evaluated once per gradient evaluation, which may be many per iteration. Also, in `generated quantities` the gradient is not computed, so all in all the penalty for computing something in generated quantities is much lower than in `model`.

Yes, no constraints + stuff declared in the `model` block is not saved in the output.
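A minimal sketch of this pattern (hypothetical names; assuming `alpha`, `beta`, and `sigma` are declared in `parameters` and `x`, `y` in `data`):

```
model {
  real mu = alpha + beta * x;     // local variable: not saved in the output
  y ~ normal(mu, sigma);
}
generated quantities {
  real mu_gq = alpha + beta * x;  // recomputed here so it appears in the output
}
```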

Best of luck!

This is a good point that I haven’t considered.

So if in the `model` block I have, for example:

```
real criterion_cd_pr_std[N,W];
real criterion_cd_pr[N,W];
real criterion_cd[N,W];
real X[N,W,Xdim];
for (s in 1:N) {
  for (w in 1:W) {
    criterion_cd_pr_std[s,w] ~ std_normal();
    if (w == 1) {
      X[s,w,] = to_array_1d(X1[s,]);
    } else {
      X[s,w,] = to_array_1d(A * to_vector(X[s,w-1,]));
    }
    criterion_cd_pr[s,w] = C * to_vector(X[s,w,]) + sigma_r * criterion_cd_pr_std[s,w];
    criterion_cd[s,w] = exp(criterion_cd_pr[s,w]);
    choice_cd[s,w] ~ bernoulli(inv_logit(criterion_cd[s,w]));
  }
}
```

and in `generated quantities` I have the mirroring calculations of `criterion_cd`:

```
real criterion_cd[N,W];
real criterion_cd_pr[N,W];
real X[N,W,Xdim];
for (s in 1:N) {
  for (w in 1:W) {
    if (w == 1) {
      X[s,w,] = to_array_1d(X1[s,]);
    } else {
      X[s,w,] = to_array_1d(A * to_vector(X[s,w-1,]));
    }
    criterion_cd_pr[s,w] = C * to_vector(X[s,w,]) + sigma_r * criterion_cd_pr_std[s,w];
    criterion_cd[s,w] = exp(criterion_cd_pr[s,w]);
  }
}
```

So the `criterion_cd` parameter that I fit the `bernoulli` with is different when I calculate it in the `generated quantities` block from when I actually fit the model?

Edit: fixed errors in the example and simplified it.

Not really “different”: it is just that the `model` block is evaluated for each transition until a U-turn is detected, while generated quantities are evaluated only once per iteration (i.e. after the U-turn is detected and one point on the whole trajectory is accepted). The samples you get from the model are the accepted points from each trajectory (i.e. one per iteration). However, since it is the same code with the same input numbers, the results for the accepted point computed along the trajectory and the ones computed in generated quantities should match (unless there is a bug in Stan).

My point was primarily about efficiency: computing something in generated quantities tends to be *much* cheaper than computing the same thing in `model` / `transformed parameters`, so even if you end up with duplicated code in generated quantities, it is unlikely to affect performance much.

Hope that clarifies more than confuses.

It makes sense. Thank you!