Order of the pade approximation with matrix exponential

charlesm93 · February 22, 2018, 4:33pm

I thought I’d share this on discourse before submitting an issue on Github.

The matrix exponential function in Stan is based on Eigen’s implementation and uses a Pade approximation (see Molver and Van Loan, 2003 for a good description of the algorithm). A while ago, I tried monitoring the order of the approximation and found that, unless we’re working on simple matrices, the order of the approximation goes to 13 – which leads to rather intensive matrix operations. Another package, expokit, uses an approximation of order 6 (Sidje, 1998). This might be something we want to investigate. Do we need such a high order, what are the pros and cons, should we give the user control, etc.

If there is interest, I’ll post an issue on github.

aaronjg · February 23, 2018, 6:28pm

Would it be possible to use a lower order approximation for the leapfrog steps and a higher order for the final accept/reject choice? My understanding is that this sort of thing is possible in HMC, though others here certainly know more about this than I do.

My experience with the matrix exponential is it can be difficult to find one implementation that works for all problem types. The Moler Van Loan paper you cited has a good discussion about this. The R expm package gives users control over which method and order which seems to work pretty well.

anon75146577 · February 23, 2018, 6:35pm

low accuracy could mess with the hamiltonian trajectories, so it’s probably not a great idea. This will be a challenge for very non-normal matrices. I’d say a tunable parameter in the call would be a good idea with high-order by default.

(If the matrix is SPD, the problem is much easier and you can get a very cheap, accurate approximation using Trefethen’s Caratheodery-Fejer method. But if it’s a general matrix, a combination of Pade and Scaling and squaring is probably unavoidable. Nick Higham has some more recent papers on this than the van loan one. He also did some stuff on the “condition number” which is really computing the Frechet derivative, which should be useful for autodiff)

Bob_Carpenter · February 27, 2018, 7:37am

That would be very hard with Stan’s current architecture, which doesn’t separate these two calculations.

There is no final accept-reject step in NUTS, only in static HMC. The original NUTS used slice samplig with a bias toward the last doubling, whereas the current version in Stan uses a multinomial version introduced by @betanalpha.

We can have more than one.

Topic		Replies	Views
Examples where numerical solver is faster than matrix exp Algorithms	7	964	February 3, 2019
Matrix exponential function Modeling	4	1416	July 8, 2019
Numerical error tanks stepsize, other possibilities? Developers	21	2007	March 16, 2017
Matrix_exp and pre 2.13 Stan for RStan 2.13 Developers	1	524	November 18, 2016
Schur decomposition Developers features	7	1323	November 4, 2016

Order of the pade approximation with matrix exponential

Related topics