Gamma process - is this feasible?

Hello,

I am trying to developed a gamma generating process

negative_binomial -> gamma -> hyper_prior

and I found this proposed solution. Has someone approached this process before? Do you think there is the foundation for creating (myself) a custom density function in stan

Few useful links:

All those functions are built into Stan, so you can just write your own _lpdf as a Stan function. It’d be more efficient to do it in C++ with vectorization and analytical gradients, but it shouldn’t be bad encoded directly in Stan. You might be able to code it up in terms of sufficient statistics to make things faster.

No, I don’t have experience with this distribution.

Thanks Bob,

I don’t need to add the normalisation constant right?

I will implement in Stan. @sakrejda do you have any general design suggestion before I dive into that?

Nope, I haven’t tried something like this so my only suggestion is to make sure to log and simplify on paper before you try to code it (sort of a duh, but I thought I’d say it, never write a density that has a tgamma in it) but it looks like fun stuff. What kind of data are you using this for?

It is gene expression (counts) data. And the process follows nb -> gamma ->
hyperprior with shape and rate unknown. A nightmare so far.

Do you have anything to say about sufficient statistics looking at the book
image i posted (page 25). I’m quite blank on that.

Can you be clearer about which part is confusing? It seems pretty clear to me but maybe I’m missing the point (?) The sufficient statistics S and P are defined on p. 25, once you log you get:

(a-1) * sum_i=1^n (log(x_i)) - log b - log (sum_i=1^n x_i) - n (lgamma(a) - a * log(b)) 

That makes it pretty efficient.

(shouldn’t it be [b*sum(x)] instead of [log b - log (sum_i=1^n x_i)] ? )

Just to make sure I understand:

the equation 96 is should be used in my case instead of gamma as prior for the mean of the negative_binomial_2, and the equation 97 should be used as prior of alphas and betas from the equation 96

negative_binomial_2[n,i] -> 96[i] -> 97

OR

negative_binomial_2[n,i] -> gamma[i] -> 96

(Yes you’re right)

yes

I don’t understand what you’re asking. (?)

Thanks,

simply this

negative_binomial_2[n,i] -> 96[i] -> 97

is same as

“the equation 96 is should be used in my case instead of gamma as prior for the mean of the negative_binomial_2, and the equation 97 should be used as prior of alphas and betas from the equation 96”

The alternative is

negative_binomial_2[n,i] -> gamma[i] -> 96

being a gamma used as prior for NB and 96 used for prior for gamma, but I realise now that this doesn’t make sense. :)

You need the normalization constant if it depends on parameters. You can drop additive terms in the log density that only depend on data and literals (that is, they don’t depend on parameters).

Thanks a lot.

That was a good call

@Bob_Carpenter do you have any suggestion on how to approach this normalising part? Integral of a function including gamma is a bit beyond my expertise…

You could try to throw this integral onto Mathematica.

Last option is to use the ode integrator in Stan to calculate the constant using numerical integration - won’t be fast but it can be done.

Good point about Mathematica, I recently did some similar ones so you’re likely to end up with some incomplete gamma functions. If you do, those are gamma_p in Stan and you likely will need a bugfix that’s in-progress.