Modelling spatiotemporal random effects results in

I think (though my understanding of the autodiff is a bit foggy, so it would be great if somebody more knowledgeable could confirm, maybe @stevebronder?) that all autodiff variables, including all intermediate expressions, are put on a stack as they are created. Then, for the reverse pass, the whole stack is traversed to propagate the gradients. Stan actually has no idea which expressions involving autodiff variables ended up contributing to the target and which did not, so the unused ones are still stored and visited. In that sense this is IMHO expected behaviour.

EDIT: Yeah, this is apparently the case, looking at the code in grad.hpp in stan-dev/math on GitHub (commit b9944754f973de6638ce21105a98a6d490e3893f).
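To make the idea concrete, here is a toy tape-based reverse-mode autodiff in Python. It is only a sketch of the general technique, not Stan's actual implementation: the names `Var`, `tape` and `grad` are made up for illustration, and the assumption is simply that every created variable is appended to one global list that the reverse pass walks from end to start.

```python
class Var:
    """Toy autodiff variable (hypothetical, not Stan's vari)."""

    def __init__(self, value, tape, parents=()):
        self.value = value
        self.adj = 0.0            # adjoint, filled in by the reverse pass
        self.parents = parents    # pairs of (parent Var, local partial derivative)
        self.tape = tape
        tape.append(self)         # every variable lands on the tape when created

    def __add__(self, other):
        return Var(self.value + other.value, self.tape,
                   ((self, 1.0), (other, 1.0)))

    def __sub__(self, other):
        return Var(self.value - other.value, self.tape,
                   ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value, self.tape,
                   ((self, other.value), (other, self.value)))


def grad(output, tape):
    """Reverse pass: walk the WHOLE tape, whether or not a node feeds output."""
    output.adj = 1.0
    visited = 0
    for node in reversed(tape):
        visited += 1
        for parent, partial in node.parents:
            parent.adj += partial * node.adj
    return visited


tape = []
x = Var(3.0, tape)
y = Var(2.0, tape)
unused = x * y - x * y    # three extra nodes that never reach the target
target = x * x + y        # two nodes that do
visited = grad(target, tape)
```

With these inputs the gradients come out as expected (`x.adj` is 6.0 and `y.adj` is 1.0), but the reverse pass still visits all 7 nodes on the tape, including the three created for `unused`. Their adjoints stay at zero, so they do not change the result; they just cost memory and time.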

I also think that, since the parser moved to OCaml, there are plans to run optimizations and other transformations over the parsed program, which could lead to things like recognizing that x - x is always zero, but I don’t think anything like this is in the current release.
