Gamma(.., ..)^2 -> generalized_gamma(?, ?, ?)

Hello,

I have shape and rate parameters of a gamma distribution that seem to be related roughly as rate = shape^2.

Now I want to check whether rate follows a generalized_gamma distribution. However, my first attempts at writing such a function failed in testing.

I found the generalised gamma function in BUGS.

These are my attempts:

#1
set.seed(1) # for reproducible example
s = 50
r = 3
x <- rgamma(10000,shape=s,rate=r)
stan(
model_code="
functions{
real ggamma_lpdf(real x, real a, real b, real c) {
return( (c*a * (log(c) + log(b)) ) + ((c*a - 1) * log(x) ) + (-(b*x)^c ) - log( tgamma(a) ));
}
}
data {
int<lower=1> G;
vector[G] y;
}
parameters { real<lower=0> h[3]; }
model { for(g in 1:G) y[g] ~ ggamma(h[1], h[2], h[3]); }
",
data = list(G=length(x), y=x)
)

#2
set.seed(1) # for reproducible example
s = 50
r = 3
x <- rgamma(10000,shape=s,rate=r)
stan(
model_code="
functions{
real ggamma_lpdf(real x, real a, real b, real c) {
return(log(c*b^(c*a) * x^(c*a - 1) * exp(-(b*x)^c) / tgamma(a)));
}
}
data {
int<lower=1> G;
vector[G] y;
}
parameters { real<lower=0> h[3]; }
model { for(g in 1:G) y[g] ~ ggamma(h[1], h[2], h[3]); }
",
data = list(G=length(x), y=x)
)

Thanks

So for starters it’s likely not working because you’re taking logs of the parameters and your parameters are not defined to be positive. So, check out the section of the manual on constraints and give that a try. Also, you want lgamma(a) instead of log(tgamma(a)).

Something like:

functions{
real ggamma_lpdf(real x, real a, real b, real c) {
return( (c*a * (log(c) + log(b)) ) + ((c*a - 1) * log(x) ) + (-(b*x)^c ) - lgamma(a) );
}
}
data {
int G;
vector[G] y; 
}
parameters { real<lower=0> h[3]; }
model { for(g in 1:G) y[g] ~ ggamma(h[1], h[2], h[3]); }
", 
data = list(G=length(x), y=x)
)

Here are two that work and are tested against R’s flexsurv. One of them works better than the other. Don’t use the cdf’s/lcdf’s with the develop branch of Stan because gamma_p doesn’t work well enough there; if you need to use the cdf’s just ask, I have a branch for this somewhere.

//// ---- standard_generalized_gamma

//' Naive implementation of the generalized Gamma density.
//' @param x Value to evaluate density at.
//' @param alpha Shape parameter.
//' @param beta Scale parameter.
//' @param nu Tail parameter.
real generalized_gamma_pdf_1S(real x, real alpha, real beta, real nu) {
  real d;
  d = nu/(beta*tgamma(alpha/nu)) *
    (x/beta)^(alpha-1) * exp(-(x/beta)^nu);
  return d;
}


//' @describeIn generalized_gamma_pdf_1S Naive implementation of the
//' generalized Gamma log density.
real generalized_gamma_lpdf_1S(real x, real alpha, real beta, real nu) {
  real d;
  d = log(nu) - log(beta) - lgamma(alpha/nu) +
    (alpha-1)*(log(x) - log(beta)) - (x/beta)^nu;
  return d;
}


//' @describeIn generalized_gamma_pdf_1S Naive implementation of the
//' generalized Gamma cumulative density.
real generalized_gamma_cdf_1S(real x, real alpha, real beta, real nu) {
  real d;
  d = gamma_p(alpha/nu, (x/beta)^nu);
  return d;
}


//' @describeIn generalized_gamma_pdf_1S Naive implementation of the
//' generalized Gamma log cumulative density.
real generalized_gamma_lcdf_1S(real x, real alpha, real beta, real nu) {
  real d;
  d = log(generalized_gamma_cdf_1S(x, alpha, beta, nu));
  return d;
}

//// ---- lawless_generalized_gamma

//' Implementation of the generalized Gamma density from Lawless (1992)
//' @param x Value to evaluate density at.
//' @param k Shape parameter.
//' @param mu Location parameter.
//' @param sigma Scale parameter.
real generalized_gamma_pdf_2S(real x, real k, real mu, real sigma) {
  real w;
  real d;
  w = (log(x)-mu)/sigma;
  d = (k^(k-.5))/sigma/tgamma(k) * exp(
    sqrt(k)*w - k*exp(k^(-.5)*w)
  )/x;
  return d;
}

//' @describeIn generalized_gamma_pdf_2S Implementation of the generalized
//' Gamma log density from Lawless (1992)
real generalized_gamma_lpdf_2S(real x, real k, real mu, real sigma) {
  real y;
  real w;
  real d;
  y = log(x);
  w = (y-mu)/sigma;
  d = (k-.5)*log(k) - log(sigma) - lgamma(k) +
    (sqrt(k)*w - k*exp(1/sqrt(k)*w)) - y;
  return d;
}


//' @describeIn generalized_gamma_pdf_2S Implementation of the generalized
//' Gamma cumulative density from Lawless (1992)
real generalized_gamma_cdf_2S(real x, real k, real mu, real sigma) {
  real w;
  real d;
  w = (log(x) - mu)/sigma;
  d = gamma_p(k, k*exp(1/sqrt(k)*w));
  return d;
}

//' @describeIn generalized_gamma_pdf_2S Implementation of the generalized
//' Gamma cumulative log density from Lawless (1992)
real generalized_gamma_lcdf_2S(real x, real k, real mu, real sigma) {
  real w;
  real d;
  w = (log(x) - mu)/sigma;
  d = log(gamma_p(k, k*exp(1/sqrt(k)*w)));
  return d;
}

Thanks. I’m currently testing S1 with some success (S2 never completed with the following code):

set.seed(1) # for reproducible example
s = 50
r = 3
x <- rgamma(10000,shape=s,rate=r)
(stan(
model_code="
functions{
real ggamma_lpdf(real x, real k, real mu, real sigma) {
real y;
real w;
real d;
y = log(x);
w = (y-mu)/sigma;
d = (k-.5)*log(k) - log(sigma) - lgamma(k) +
(sqrt(k)*w - k*exp(1/sqrt(k)*w)) - y;
return d;
}
}
data {
int<lower=1> G;
vector[G] y;
}
parameters { real<lower=0> h[3]; }
model { for(g in 1:G) y[g] ~ ggamma(h[1], h[2], h[3]); }
",
data = list(G=length(x), y=x)
))

Here is the more general picture of what I’m trying to achieve. Maybe there is a fundamental issue with the way I am thinking of my hierarchical negative binomial process --> Hierarchical negative binomial process

I’m a bit confused here about the prediction.

s = 50
r = 3
x <- rgamma(1000,shape=s,scale=r)
summary(stan(
	model_code="
	functions{
   real ggamma_lpdf(real x, real alpha, real beta, real nu) {
   real d;
   d = log(nu) - log(beta) - lgamma(alpha/nu) +  (alpha-1)*(log(x) - log(beta)) - (x/beta)^nu;
   return d;
}
	}
	data {
	int<lower=1> G;
	vector[G] y; 
	}
	parameters { real<lower=0> h[3]; }
	model { for(g in 1:G) y[g] ~ ggamma(h[1], h[2], h[3]); }
	", 
	data = list(G=length(x), y=x)
))$summary[1:3, 1]

gives
h[1] h[2] h[3]
34.547457 21.854673 1.523187

when the shape and scale are 50 and 3

A few things to check:

  1. R-hat, autocorrelation, divergences, treedepth, all from multiple chains; easiest via shinystan::launch_shinystan(model_object)

  2. Don’t compare parameter estimates when simulating from a gamma and fitting a generalized gamma. Compare histograms simulated from the fitted parameters to histograms of the data, or something like that (see the sketch after this list).

  3. The gamma is a limiting case and not necessarily a well behaved one (I don’t recall if the parameterization you’re using is ok for that).

  4. A shape parameter of 30 is very high, high enough that you might have trouble estimating it from a thousand data points; try something like shape 2, scale 1 first.
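
For point 2, here’s roughly what I mean (an untested sketch; it assumes you keep the stanfit from the model above in an object called fit instead of summarising it inline, and dgen_gamma_1S is just an R port of the 1S lpdf above):

# Overlay the fitted generalized gamma density on the data histogram
# rather than comparing parameter values.
dgen_gamma_1S <- function(x, alpha, beta, nu) {
  exp(log(nu) - log(beta) - lgamma(alpha / nu) +
        (alpha - 1) * (log(x) - log(beta)) - (x / beta)^nu)
}

h <- summary(fit)$summary[1:3, "mean"]   # posterior means of h[1..3]
hist(x, breaks = 50, freq = FALSE)
curve(dgen_gamma_1S(t, alpha = h[1], beta = h[2], nu = h[3]),
      from = min(x), to = max(x), add = TRUE, col = "red", xname = "t")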

Actually, I just checked. It’s not mixing at all so don’t worry about what the estimates are. Did you try one of the functions I sent?

The one I showed you is one of the functions you sent me

Ok, did you try it with something like shape as 2 or 6 and rate as 1? I’m currently testing these and they usually work fine but I haven’t tested a grid of values yet. Good to know that there will be ranges where it’s challenging. The _2S version might work better; for optimization it’s supposed to avoid some of the multi-modality issues this density has.

Hello,

Yes, it seems to work for those values; it converges more slowly or doesn’t converge for more extreme values.

But I’m now realising that what I thought was a gamma with high shape (50 < y < 100) is probably another distribution, like a right-skewed generalised gamma.

And actually, it might be something shifted, for example:

y - 50 ~ gamma()

A lack of fit combined with very high shape values could cause the problems you’re seeing. I’ll check that region when I set up simulations for this. When I looked at the chains I thought HMC should be fine, but maybe the lgamma or one of the gradients is losing accuracy; this model gets very low acceptance rates at times for these large shape values. Glad you’re finding a simpler solution, I’ll try to remember to update this thread once I have some better results.

Could you please write down generated quantities for S1? I would like to check one thing.

Thanks!

You mean an RNG?

Yes, so I can check the predictions.

I don’t recall writing one; I would just use a slice sampler, just make sure you work with the log density… sorry, I had a simple R implementation kicking around but I can’t find it right now. Really, a more sensible thing is to work with the RNG from R’s flexsurv package. If you want to see how the parameterizations are related check out this repo: https://github.com/sakrejda/survival-densities

Specifically, here are tests relating flexsurv’s parameterizations to mine, for 1S: https://github.com/sakrejda/survival-densities/blob/master/code/waitup/waitup/tests/testthat/test-generalized_gamma_1-pdf-matches-cdf.R
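
To be concrete, here is an untested sketch of what I mean by using flexsurv’s RNG for the 1S (alpha, beta, nu) form. rgen_gamma_1S is just a wrapper name I’m making up here, and the mapping shape = nu, scale = beta, k = alpha/nu is my reading of the two densities, so check it against the tests above:

library(flexsurv)

# Draw from the 1S (alpha, beta, nu) form via flexsurv's "orig" generalized gamma.
rgen_gamma_1S <- function(n, alpha, beta, nu) {
  rgengamma.orig(n, shape = nu, scale = beta, k = alpha / nu)
}

# e.g. predictive draws using the estimates you posted, just as an example:
y_rep <- rgen_gamma_1S(10000, alpha = 34.5, beta = 21.9, nu = 1.52)
hist(y_rep, breaks = 50)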

If you want to check out that package you need to clone the repo and build it using cmake; I haven’t published a build yet. Just comment out the fits subdirectory or it’ll build a lot of other stuff. The manuscripts subdir has an .Rmd file that writes down the math for these densities in detail… I’ll just upload a build when I get the chance but it might not be for a few days.

Thanks,

Just looking at convergence, the S1 version seems to converge with no problem:

library("flexsurv")
x=rgengamma(1000, mu = 6, sigma = 0.3, 0.4)
hist(x)
fit = stan(
	model_code="
	functions{
real ggamma2_lpdf(real x, real k, real mu, real sigma) {
  real y;
  real w;
  real d;
  y = log(x);
  w = (y-mu)/sigma;
  d = (k-.5)*log(k) - log(sigma) - lgamma(k) +
    (sqrt(k)*w - k*exp(1/sqrt(k)*w)) - y;
  return d;
}
	}
	data {
	int<lower=1> G;
	vector[G] y; 
	}
	parameters {  real<lower=0> o[3]; }
	model { 
	for(g in 1:G) y[g] ~ ggamma2(o[1], o[2], o[3]);
	}
	", 
	data = list(G=length(x), y=x)
)

while S2 does not seem to (although the center/variance parameterization would be more comfortable for me):

install.packages("flexsurv")
library("flexsurv")
x=rgengamma(1000, mu = 6, sigma = 0.3, 0.4)
fit = stan(
	model_code="
	functions{
real ggamma2_lpdf(real x, real k, real mu, real sigma) {
  real y;
  real w;
  real d;
  y = log(x);
  w = (y-mu)/sigma;
  d = (k-.5)*log(k) - log(sigma) - lgamma(k) +
    (sqrt(k)*w - k*exp(1/sqrt(k)*w)) - y;
  return d;
}
	}
	data {
	int<lower=1> G;
	vector[G] y; 
	}
	parameters {  real<lower=0> o[3]; }
	model { 
	for(g in 1:G) y[g] ~ ggamma2(o[1], o[2], o[3]);
	}
	", 
	data = list(G=length(x), y=x)
)
Warning messages:
1: There were 1362 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
2: Examine the pairs() plot to diagnose sampling problems

Cheers

Interesting, they both work for me for at least some parameter values but I haven’t nailed down exactly where they do better/worse in parameter space.

I figured out the problem here. The k/mu/sigma parameterization (as written) has drastically non-linear effects on the shape of the density from the different parameters, and the effects are on drastically different scales. I didn’t finalize this version of the density because it has some other problems for my specific application, but I will rethink it… hopefully soon, but realistically it’ll be at least a week or two. If you have that package you can run this snippet in R to see the problem; play around with the values and you’ll see what I mean:

library(dplyr)
library(ggplot2)

tp <- expand.grid(x = seq(from = 10^-3, to = 20, length.out = 50),
                  y = seq(from = 10^-3, to = 1, length.out = 50)) %>%
  dplyr::mutate(z = mapply(function(x, y) waitup:::generalized_gamma_pdf_2S(x = x, k = 10, mu = 2, sigma = y),
                           x = x, y = y))
ggplot(data = tp, aes(x = x, y = y, fill = z)) + geom_raster()

I’m pretty sure the tests in the package show that the two parameterizations are equivalent in terms of the density that’s ultimately specified, but I’ll have to look at the scaling and transformations to get something that’s Stan-friendly. You can also parameterize in Stan with k/mu/sigma and then generate alpha/beta in the transformed parameters block and get somewhere that way.
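
For what it’s worth, my rough reading of the two densities above is that (k, mu, sigma) maps to (alpha, beta, nu) as alpha = sqrt(k)/sigma, nu = 1/(sigma*sqrt(k)), beta = exp(mu - log(k)/nu), which is exactly the sort of thing you could put in the transformed parameters block. Here is a quick, untested R check of that mapping; the two log-densities are direct ports of the Stan functions above, so treat it as a sketch until the package tests confirm it:

lpdf_1S <- function(x, alpha, beta, nu) {
  log(nu) - log(beta) - lgamma(alpha / nu) +
    (alpha - 1) * (log(x) - log(beta)) - (x / beta)^nu
}
lpdf_2S <- function(x, k, mu, sigma) {
  w <- (log(x) - mu) / sigma
  (k - 0.5) * log(k) - log(sigma) - lgamma(k) +
    (sqrt(k) * w - k * exp(w / sqrt(k))) - log(x)
}

# candidate mapping, checked numerically at a few points
k <- 10; mu <- 2; sigma <- 0.4
nu    <- 1 / (sigma * sqrt(k))
alpha <- sqrt(k) / sigma
beta  <- exp(mu - log(k) / nu)

xs <- seq(0.5, 30, length.out = 7)
max(abs(lpdf_1S(xs, alpha, beta, nu) - lpdf_2S(xs, k, mu, sigma)))  # ~0 if the mapping holds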

I’m also interested in fitting the generalized gamma distribution. Wouldn’t it be possible to do something like

data {
    int<lower=0> N;
    vector<lower=0>[N] y;
}
parameters {
    real b; // rate
    real k; // shape
    real q; // log-exponent
}
model {
    pow(y, exp(q)) ~ gamma(exp(k), exp(b));
}

I know this doesn’t work, but is there some other way to extend Stan’s implementation of gamma instead of writing a new density function from scratch?
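
One thought: maybe the missing piece is just the Jacobian for the y -> y^c change of variables. An untested sketch along those lines (simulating a plain gamma only to have some data; fit_j is a made-up name, and if I have the algebra right the implied density for y is exactly a generalized gamma):

library(rstan)

s <- 2; r <- 1
x <- rgamma(1000, shape = s, rate = r)
fit_j <- stan(
  model_code = "
  data {
      int<lower=0> N;
      vector<lower=0>[N] y;
  }
  parameters {
      real b; // log rate
      real k; // log shape
      real q; // log exponent
  }
  model {
      for (n in 1:N) {
          target += gamma_lpdf(pow(y[n], exp(q)) | exp(k), exp(b));
          // Jacobian for z = y^c with c = exp(q): log(c) + (c - 1) * log(y)
          target += q + (exp(q) - 1) * log(y[n]);
      }
  }
  ",
  data = list(N = length(x), y = x)
)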

The hand-coded generalized gamma in Stan actually works ok, so this really isn’t worth the trouble. Last time I suggested adding these distributions there was some resistance to including stuff that doesn’t get much use, so I don’t know if they’d get in anyway.