A new abstraction around probability distributions?


I'm just submitting a PR to refactor VectorView and noticed I had to change ~220 probability distribution files in almost exactly the same way. So I was wondering whether there's room for a higher-level abstraction on top of those that DRYs up the code inside each distribution a little bit. I imagine you guys have thought about this, but I'm curious what the design space here looks like (I'm still fairly new to C++).

It’d be great if we could remove code duplication. VectorView is there now precisely so that there isn’t even more code duplication and twisty branching logic.

Forgetting about C++ for the moment, what would you suggest? I don’t think the coding in C++ will be the obstacle if we can figure out how to abstract out the bits you’re worried about (though it may have to wait until we have real lambdas in C++11).

We could have done a brute-force vectorization of these functions the way we did for unary functions. The reason we didn’t is that we’re being careful to exploit shared computations and not define unnecessary intermediates. That bit’s all ad hoc on a distribution-by-distribution basis. It could also be made faster than it is now by making it a little more ad hoc (for example, by converting repeated additions, such as adding -log_sigma once per observation for normals, into a single multiplication). So whatever gets done has to leave that path open in the future.
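
To make the -log_sigma point concrete, here is a minimal sketch for a normal log density with a vector of observations and scalar mu and sigma. It is not the actual library code: both function names are made up for illustration, plain doubles stand in for autodiff types, and the constant -0.5 * log(2 * pi) term is left out. The first version adds -log_sigma once per observation; the second folds those additions into a single multiplication by the number of observations.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical name. Subtracts log_sigma inside the loop, once per observation.
double normal_lpdf_repeated_add(const std::vector<double>& y, double mu,
                                double sigma) {
  const double log_sigma = std::log(sigma);  // shared: computed once
  const double inv_sigma = 1.0 / sigma;      // shared: computed once
  double lp = 0.0;
  for (std::size_t n = 0; n < y.size(); ++n) {
    const double z = (y[n] - mu) * inv_sigma;
    lp -= log_sigma;  // repeated addition of -log_sigma
    lp -= 0.5 * z * z;
  }
  return lp;
}

// Hypothetical name. Same value up to rounding, but the N additions of
// -log_sigma become one multiplication.
double normal_lpdf_single_multiply(const std::vector<double>& y, double mu,
                                   double sigma) {
  const double log_sigma = std::log(sigma);
  const double inv_sigma = 1.0 / sigma;
  double sum_sq = 0.0;
  for (std::size_t n = 0; n < y.size(); ++n) {
    const double z = (y[n] - mu) * inv_sigma;
    sum_sq += z * z;
  }
  return -static_cast<double>(y.size()) * log_sigma - 0.5 * sum_sq;
}
```

Whether this rewrite applies depends on which arguments are scalars and which are vectors, which is part of why the optimization is hard to express generically.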

Though having said all that, the computations of intermediate quantities and the final result are all conditioned on subsets of the parameters being constant or not.
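
As a rough illustration of that conditioning (a sketch under assumptions, not the existing implementation), the three bool template flags below are hypothetical stand-ins for "this argument is a constant". Terms that depend only on constant arguments are skipped, and so are the intermediates that feed them. The function name is made up, and the -0.5 * log(2 * pi) term is omitted because it never depends on a parameter.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: computes the normal log density up to terms that are
// constant with respect to the non-constant arguments.
template <bool YIsConst, bool MuIsConst, bool SigmaIsConst>
double normal_lpdf_dropping_constants(const std::vector<double>& y, double mu,
                                      double sigma) {
  // Everything is constant: no term matters, so compute nothing.
  if (YIsConst && MuIsConst && SigmaIsConst) {
    return 0.0;
  }

  // The quadratic term involves y, mu, and sigma, so it is needed whenever
  // at least one of them is not constant.
  const double inv_sigma = 1.0 / sigma;
  double sum_sq = 0.0;
  for (std::size_t n = 0; n < y.size(); ++n) {
    const double z = (y[n] - mu) * inv_sigma;
    sum_sq += z * z;
  }
  double lp = -0.5 * sum_sq;

  // The log(sigma) term (and the call to std::log) is needed only when
  // sigma itself is not constant.
  if (!SigmaIsConst) {
    lp -= static_cast<double>(y.size()) * std::log(sigma);
  }
  return lp;
}
```

For example, `normal_lpdf_dropping_constants<true, false, true>(y, mu, sigma)` treats only mu as a parameter and never evaluates `std::log(sigma)`. Every distribution has the same outer shape (check flags, compute the needed intermediates, accumulate terms), but which intermediates are shared differs from one distribution to the next.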