Right. Thanks, @syclik, @Bob_Carpenter! That’s very clear.
So, if I understand correctly, the interpretation of conditionals like this is OK because they would only cause non-differentiability at a set of measure 0, meaning that we never encounter those points in running HMC?
While constructs, I find a little more difficult to interpret. I guess the assumption is that at any point, the while loop terminates after finitely many passes, resulting in a finite expression tree which we can just differentiate? I guess one difference between Stan and stuff that people might ultimately want to do with a general purpose probabilistic programming language (potentially) is write programs that are actually non-terminating (like servers) but which exhibit intelligent probabilistic behaviour (for instance, do smart crypto stuff). I’m just speculating here, of course, but I’m trying to see how we should construct a semantic framework for understanding probabilistic programming. Another related thing I’d be slightly worried about with while constructs, more so than case constructs, is that we can define densities that become non-differentiable on sets of non-zero measure. (I think of the use of while loops as defining countable sums of densities. Is this right?) I guess it’s just the responsibility of the programmer not to write code like that.
I guess this points to the following rule for differentiating while loops:
Let f be defined through a while loop.
- If f terminates for input u, expand the while loops and differentiate as usual.
- If f diverges at u, so does f’.