To evaluate a particular log density that I'm interested in, I have to construct Kronecker product matrices explicitly. I know the tricks that usually let one avoid forming them, but my situation is unusual and they don't apply, so I've written my own Kronecker product function in Stan.

If I look only at the computational complexity of the operations required to evaluate the log density, the Kronecker products are not the most expensive part. I take a Cholesky decomposition, which is O(p^3), whereas the Kronecker products are at most O(p^2), where p is the dimension with the greatest influence on the computational burden.

However, I know that Stan has to evaluate gradients as well. For the Cholesky decomposition, the gradients are built in and thus computed relatively efficiently, whereas for the Kronecker product function I've written, they aren't built in and may well be computed very inefficiently. How reasonable is it to think that computing gradients through my homemade Kronecker function is slowing me down a lot, relative to how things would be with a built-in Kronecker function?

I know that I should profile my code to confirm that the Kronecker products are in fact taking up a lot of the time. I'm still working out how to use the profiler.
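
For concreteness, here is a minimal sketch of the kind of setup I mean, with `profile` blocks (available in Stan 2.26 and later, as far as I understand) wrapped around each step. This is not my actual model: the function name `kronecker_prod`, the data, and the multivariate normal likelihood are placeholders chosen only to make the sketch self-contained, and the priors are left out.

```stan
functions {
  // Hand-written Kronecker product: fills the (m*p) x (n*q) result
  // one p x q block at a time, each block being A[i, j] * B.
  matrix kronecker_prod(matrix A, matrix B) {
    int m = rows(A);
    int n = cols(A);
    int p = rows(B);
    int q = cols(B);
    matrix[m * p, n * q] C;
    for (i in 1:m) {
      for (j in 1:n) {
        C[((i - 1) * p + 1):(i * p), ((j - 1) * q + 1):(j * q)] = A[i, j] * B;
      }
    }
    return C;
  }
}
data {
  int<lower=1> m;
  int<lower=1> n;
  vector[m * n] y;      // placeholder data
}
parameters {
  cov_matrix[m] A;      // placeholder parameters, no priors in this sketch
  cov_matrix[n] B;
}
model {
  matrix[m * n, m * n] K;
  matrix[m * n, m * n] L;

  // Each named block is timed separately by the profiler.
  profile("kronecker") {
    K = kronecker_prod(A, B);
  }
  profile("cholesky") {
    L = cholesky_decompose(K);
  }
  profile("likelihood") {
    y ~ multi_normal_cholesky(rep_vector(0, m * n), L);
  }
}
```

If I have understood the documentation correctly, the profiler reports the time spent in each named block, covering both the forward evaluation and the reverse (gradient) pass, so comparing the "kronecker" and "cholesky" rows should show directly whether the homemade function is the bottleneck.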