Gamma quantile function

I wasn’t so concerned with the speed of the inverse calculation as with using autodiff to get the gradients, since the inverse gamma calculation is an approximation to begin with. I don’t think there’s any way to get around that (I can’t find a published algorithm that doesn’t invoke something awkward like Meijer’s G-function), so I’m going to take your advice. I can test the code I write against values from R’s qgamma, and the gradients against finite differences? If there are other suggestions on testing, I’d welcome them.
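
For concreteness, a minimal sketch of the finite-difference check I have in mind, with Boost’s quantile standing in for whatever icdf implementation ends up on the branch (the function names here are just placeholders):

```cpp
// Minimal sketch of the finite-difference gradient check.
// gamma_icdf is a stand-in for whatever implementation lands on the branch.
#include <boost/math/distributions/gamma.hpp>
#include <iostream>

double gamma_icdf(double alpha, double p) {
  return boost::math::quantile(boost::math::gamma_distribution<>(alpha), p);
}

// Central difference in p, to compare against an analytic gradient.
double fd_grad_p(double alpha, double p, double h = 1e-6) {
  return (gamma_icdf(alpha, p + h) - gamma_icdf(alpha, p - h)) / (2 * h);
}

int main() {
  for (double p : {0.01, 0.5, 0.99})
    std::cout << "p = " << p << ", d icdf / dp ~= " << fd_grad_p(2.5, p) << "\n";
}
```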

Regarding the derivatives, for any distribution we have icdf’(y) = 1 / pdf(icdf(y)) – this is just the chain rule. So there’s no extra work to get the derivatives, assuming we have a pdf and a good approximation of the icdf. If we were willing to do Newton searches, we could do a generic implementation of inverse CDFs with derivatives for all the univariate distributions at once, by implementing inverses for univariate functions. If you need the derivative with respect to the shape parameters, there’s a similar generic computation using the multivariate chain rule. But I do expect a lot of overflows out in the tails.
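
As a sketch of that identity for the gamma case, using Boost’s pdf and quantile as the building blocks:

```cpp
// Sketch of icdf'(y) = 1 / pdf(icdf(y)) for the gamma case, with Boost's
// quantile standing in for the "good approximation of icdf".
#include <boost/math/distributions/gamma.hpp>

double gamma_icdf_grad_y(double alpha, double y) {
  boost::math::gamma_distribution<> g(alpha);
  double x = boost::math::quantile(g, y);  // x = icdf(y)
  return 1.0 / boost::math::pdf(g, x);     // chain rule: no extra approximation
}
```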

thel (Developer), December 8:
Regarding the derivatives, for any distribution we have icdf’(y) = 1 / pdf(icdf(y)) – this is just the chain rule. So there’s no extra work to get the derivatives, assuming we have a pdf and a good approximation of the icdf. If we were willing to do Newton searches, we could do a generic implementation of inverse CDFs with derivatives for all the univariate distributions at once, by implementing inverses for univariate functions.

I think this’d be worth exploring. It’d be nice to have the inverse CDFs.

It’d be nice to expose the actual optimizer and its derivatives, I think.
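
Roughly, the generic Newton inversion could look something like the sketch below; cdf and pdf are passed in, so one implementation would cover every univariate distribution. Bracketing and other safeguards are omitted and would matter in practice, especially in the tails:

```cpp
// Rough sketch of a generic Newton inversion: solve cdf(x) = y, using
// the pdf as the derivative. Safeguards (bracketing, step damping) are
// omitted here and would matter in practice.
#include <cmath>
#include <functional>

double icdf_newton(const std::function<double(double)>& cdf,
                   const std::function<double(double)>& pdf,
                   double y, double x0,
                   double tol = 1e-12, int max_iter = 100) {
  double x = x0;
  for (int i = 0; i < max_iter; ++i) {
    double step = (cdf(x) - y) / pdf(x);  // Newton step on cdf(x) - y = 0
    x -= step;
    if (std::fabs(step) < tol * (1.0 + std::fabs(x)))
      break;
  }
  return x;
}
```

The derivative of the result with respect to y then comes for free as 1 / pdf(x), as above.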

If you need the derivative with respect to the shape parameters,

Yes, we need derivatives with respect to everything. At least with the way functions are architected and documented now, they can be used anywhere the Stan types match.

  • Bob

In this approach, the story with the derivative with respect to a parameter q is just to write

cdf(q, icdf(q, x)) = x

differentiating with respect to q, and writing partial_q for the partial derivative, we find

partial_q cdf(q, icdf(q,x)) + pdf(q, icdf(q,x)) * partial_q icdf(q,x) = 0

so

partial_q icdf(q,x) = - partial_q cdf(q, icdf(q,x)) / pdf(q, icdf(q,x))
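
A sketch of what this looks like for the gamma shape parameter. I don’t have a closed form for the partial of the CDF in the shape, so that piece is itself taken by central difference here, with Boost supplying cdf, pdf, and quantile:

```cpp
// Sketch of partial_q icdf = -partial_q cdf / pdf for the gamma shape.
// The partial of the CDF in the shape is approximated by central difference.
#include <boost/math/distributions/gamma.hpp>

double gamma_icdf_grad_shape(double alpha, double x, double h = 1e-6) {
  using boost::math::gamma_distribution;
  double z = boost::math::quantile(gamma_distribution<>(alpha), x);  // icdf(q, x)
  double dcdf_dq = (boost::math::cdf(gamma_distribution<>(alpha + h), z)
                    - boost::math::cdf(gamma_distribution<>(alpha - h), z))
                   / (2 * h);                                        // partial_q cdf
  double f = boost::math::pdf(gamma_distribution<>(alpha), z);      // pdf(q, icdf(q, x))
  return -dcdf_dq / f;
}
```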

Again I worry about overflows, but I think they’re “true” overflows in the sense that the icdf is genuinely badly behaved in the tails (its graph is the cdf’s with the horizontal and vertical axes swapped, so the cdf’s flat tails become near-vertical). But the pattern is the same as before: if I can evaluate the function, its inverse, and the derivatives of the function, I can evaluate all derivatives of the inverse with generic formulae.

The downside is of course that these formulas depend on the inverse function relationship being exact, so the “derivatives” would not actually be derivatives of the approximation, nor are they obviously the best approximation to the true derivative. I’m starting to doubt the accuracy: looking at those pdfs in the denominator, I’m quite worried about dividing by a near-zero density in the tails.

Anyway, it’s a generic enough approach, but I’m suspicious of the accuracy. It might be worth trying if we’ve got time to do a lot of testing.

-thel

This all makes sense. I like the idea of getting a Newton root solver in and having an implementation for all the inverse CDFs. I’d be interested in working on that sometime in January, but won’t really have any time for it before then. In the next week or so I’ll get Boost’s inverse gamma and its derivatives done, at least on a branch, and I’ll use this idea for the shape derivative, so maybe that can be a test case.

Maybe there are some ideas to take from this blog post, where some floating-point traps in the context of bisection are described:
http://www.shapeoperator.com/2014/02/22/bisecting-floats/
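
If I read the post right, the key trick is to bisect on the bit patterns of the floats rather than on their values; a rough C++ translation, for a bracket of non-negative doubles with f(lo) < 0 < f(hi):

```cpp
// For non-negative doubles, the IEEE-754 bit patterns read as integers are
// ordered like the values, so this midpoint always makes progress and the
// loop terminates in at most ~64 steps -- no tolerance parameter needed.
#include <cstdint>
#include <cstring>
#include <functional>

double bit_midpoint(double a, double b) {  // assumes 0 <= a <= b
  std::uint64_t ia, ib;
  std::memcpy(&ia, &a, sizeof a);
  std::memcpy(&ib, &b, sizeof b);
  std::uint64_t im = ia + (ib - ia) / 2;   // midpoint among representable doubles
  double m;
  std::memcpy(&m, &im, sizeof m);
  return m;
}

double bisect(const std::function<double(double)>& f, double lo, double hi) {
  // Assumes f(lo) < 0 < f(hi) with 0 <= lo <= hi.
  while (true) {
    double mid = bit_midpoint(lo, hi);
    if (mid == lo || mid == hi)
      return mid;  // no representable double strictly between lo and hi
    if (f(mid) < 0)
      lo = mid;
    else
      hi = mid;
  }
}
```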

Julia implementations of some of the methods are there:

This derivative-free method looks especially interesting:

That’s fun stuff! I hadn’t run into such a clear example of the discreteness of floating point numbers yet.

That is also the case for me; the example is quite illuminating.

I have not implemented the New Eighth-Order Derivative-Free Methods for Solving Nonlinear Equations yet. To be honest, I worry a little bit about numerical stability. However, if it works stably, the eighth-order convergence should be impressive.

The usual place we run into it is that 1 - epsilon evaluates to 1 for any epsilon with |epsilon| below roughly 1e-16. There’s much better resolution around 0, hence the need for complementary CDFs.
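
A two-line demonstration of that asymmetry in doubles:

```cpp
// 1 - eps rounds back to exactly 1, while the same eps is perfectly
// representable near 0 -- which is why the complementary CDF matters.
#include <iostream>

int main() {
  double eps = 1e-17;
  std::cout << ((1.0 - eps) == 1.0) << "\n";  // prints 1: eps is lost near 1
  std::cout << (eps == 0.0) << "\n";          // prints 0: eps survives near 0
}
```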


Were you ever able to incorporate Boost’s inverse gamma as you mentioned here? I have been struggling to do the same.