# Priors for hierarchical Gamma-Gamma model (Extended Pareto distribution)

Hi everyone,

I hope you are all doing well. I have the following code for fitting an extended Pareto distribution, which can be constructed as a Gamma mixture in which the rate parameter is itself Gamma distributed.
The resulting density is

f(y)=\frac{\Gamma(\alpha+\theta)}{\Gamma(\alpha)\Gamma(\theta)} \frac{1}{\beta} \frac{(y/\beta)^{\theta-1}}{(1+y/\beta)^{\alpha+\theta}}

where \theta is the shape of target Gamma and \alpha and \beta are the shape and rate of the mixing Gamma. Moreover,

E(Y)=\frac{\theta \beta}{\alpha-1}\\ sd(Y)=E(Y) \sqrt{\frac{\alpha+\theta-1}{\theta(\alpha-2)}}
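As a quick sanity check of these moment formulas, here is a small numpy simulation of the mixture construction (the parameter values are illustrative only, not from any real data set):

```python
import numpy as np

# Illustrative values: theta = shape of the target Gamma,
# alpha, beta = shape and rate of the mixing Gamma.
theta, alpha, beta = 2.0, 10.0, 100.0
rng = np.random.default_rng(0)
n = 200_000

# Mixture construction: lambda ~ Gamma(alpha, rate beta), Y | lambda ~ Gamma(theta, rate lambda)
lam = rng.gamma(shape=alpha, scale=1.0 / beta, size=n)
y = rng.gamma(shape=theta, scale=1.0 / lam)

mean_formula = theta * beta / (alpha - 1)
sd_formula = mean_formula * np.sqrt((alpha + theta - 1) / (theta * (alpha - 2)))

print(y.mean(), mean_formula)  # sample mean vs theta*beta/(alpha-1)
print(y.std(), sd_formula)     # sample sd vs the closed form
```

The sample mean and sd should land close to the closed-form expressions for any alpha > 2.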

A simple model,

data {
  int<lower=0> N;
  vector<lower=0>[N] y_i;
}
parameters {
  real<lower=0> theta[N];
  real<lower=0> alpha;
  real<lower=0> gamma;
  real<lower=0> mu;
}
model {
  // target += normal_lpdf(alpha | 2, 0.5);
  // target += normal_lpdf(mu | 10, 2);
  // target += normal_lpdf(gamma | 100, 50);
  target += gamma_lpdf(theta | mu, gamma);
  target += gamma_lpdf(y_i | alpha, theta);
}
generated quantities {
  real yrep[N];
  for (n in 1:N) {
    yrep[n] = gamma_rng(alpha, gamma_rng(mu, gamma));
  }
}


I tested it with the following simulated data (in R): rgamma(n = 10000, shape = 2, rate = rgamma(10000, shape = 10, rate = 100))

How would you generally propose appropriate priors for this case?

With these priors

target += normal_lpdf(alpha | 2, 0.5);
target += normal_lpdf(mu | 10, 2);
target += normal_lpdf(gamma | 100, 50);

But with these

target += normal_lpdf(alpha | 0, 5);
target += normal_lpdf(mu | 0, 25);
target += normal_lpdf(gamma | 0, 250);
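One generic way to compare prior choices like these is a quick prior-predictive simulation. Here is a rough Python sketch, using rejection sampling as a stand-in for the `<lower=0>`-truncated normal priors; all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def prior_predictive(mu_loc, mu_scale, gamma_loc, gamma_scale,
                     alpha_loc, alpha_scale, n=10_000):
    """Draw y from the model, with normal priors truncated to (0, inf) by rejection."""
    def trunc_normal(loc, scale, size):
        x = rng.normal(loc, scale, size=size * 10)
        x = x[x > 0]           # keep only the positive draws
        return x[:size]
    alpha = trunc_normal(alpha_loc, alpha_scale, n)
    mu = trunc_normal(mu_loc, mu_scale, n)
    gam = trunc_normal(gamma_loc, gamma_scale, n)
    theta = rng.gamma(shape=mu, scale=1.0 / gam)      # theta ~ Gamma(mu, rate gamma)
    return rng.gamma(shape=alpha, scale=1.0 / theta)  # y ~ Gamma(alpha, rate theta)

y_tight = prior_predictive(10, 2, 100, 50, 2, 0.5)    # the first prior set
y_wide = prior_predictive(0, 25, 0, 250, 0, 5)        # the second prior set
print(np.median(y_tight), np.median(y_wide))
```

Comparing the two prior-predictive distributions against the scale of plausible data makes the trade-off between the two sets concrete.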


You have N latent values of \theta, which suggests you’re trying to estimate N + 3 parameters from N observations. I tried letting \theta be a scalar, but ran into serious sampling problems. This doesn’t seem to be an easy model to fit.

Yes. I’ll try generating samples for lower values of \mu (the shape of the mixing Gamma) and see what happens.
Generally, are the realizations of \theta considered actual parameters?

“If you don’t see it, it’s a parameter”, Statistician, Bayesian. ;-)


I am trying this now, but the posteriors are strange. I may have some error in the formulae.

data {
  int<lower=0> N;  // number of observations
  vector<lower=0>[N] y_i;
}
parameters {
  real<lower=0> theta;
  real<lower=0> alpha;
  real<lower=0> gamma;
}
model {
  target += normal_lpdf(alpha | 2, 0.5);
  target += normal_lpdf(gamma | 10, 5);
  target += normal_lpdf(theta | 100, 30);

  target += log(lbeta(alpha, gamma)) - log(theta) + (alpha - 1) * log(y_i / theta) - (alpha + gamma) * log(1 + y_i / theta);
}
generated quantities {
  //real yrep[N];
}


f_{X_i}(x)=\int_{\beta} f_{X_i}(x\mid\beta)\, f(\beta)\, d\beta \\
f_{X_i}(x)=\int_{\beta} \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}\, \frac{\theta^{\gamma}}{\Gamma(\gamma)} \beta^{\gamma-1} e^{-\theta\beta}\, d\beta \\
f_{X_i}(x)=\frac{x^{\alpha-1}\,\theta^{\gamma}}{\Gamma(\alpha)\Gamma(\gamma)} \int_{\beta} \beta^{\alpha+\gamma-1} e^{-\beta(x+\theta)}\, d\beta \\
f_{X_i}(x)=\frac{x^{\alpha-1}\,\theta^{\gamma}}{\Gamma(\alpha)\Gamma(\gamma)} \frac{\Gamma(\alpha+\gamma)}{(x+\theta)^{\alpha+\gamma}} \int_{\beta} \frac{(x+\theta)^{\alpha+\gamma}}{\Gamma(\alpha+\gamma)} \beta^{\alpha+\gamma-1} e^{-\beta(x+\theta)}\, d\beta \\
f_{X_i}(x)=\frac{\Gamma(\alpha+\gamma)}{\Gamma(\alpha)\Gamma(\gamma)} \frac{x^{\alpha-1}\,\theta^{\gamma}}{(x+\theta)^{\alpha+\gamma}} \\
f_{X_i}(x)=\frac{\Gamma(\alpha+\gamma)}{\Gamma(\alpha)\Gamma(\gamma)} \frac{x^{\alpha-1}\,\theta^{\gamma}}{(x+\theta)^{\alpha+\gamma}} \cdot \frac{\theta^{\alpha}}{\theta^{\alpha}} \\
f_{X_i}(x)=\frac{\Gamma(\alpha+\gamma)}{\Gamma(\alpha)\Gamma(\gamma)} \frac{1}{\theta} \frac{(x/\theta)^{\alpha-1}}{(1+x/\theta)^{\alpha+\gamma}}
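As a numeric sanity check that this closed form is a proper density, it can be integrated on a grid (illustrative parameter values; plain numpy with the trapezoidal rule):

```python
import math
import numpy as np

# Illustrative parameters: alpha = target shape, gamma = mixing shape, theta = scale.
alpha, gamma, theta = 2.0, 10.0, 100.0

# Normalizing constant Gamma(alpha+gamma) / (Gamma(alpha) Gamma(gamma)), via logs for stability.
const = math.exp(math.lgamma(alpha + gamma) - math.lgamma(alpha) - math.lgamma(gamma))

xs = np.linspace(0.0, 5000.0, 2_000_001)  # the tail mass beyond 5000 is negligible here
f = const * (1.0 / theta) * (xs / theta) ** (alpha - 1) / (1.0 + xs / theta) ** (alpha + gamma)

# Trapezoidal rule; the result should be very close to 1.
total = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(xs))
print(total)
```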

Shouldn’t this be just lbeta(...)? You may also be missing a minus sign there.

Hello, I got back to this. I missed the minus, but isn’t it log(lbeta)? Should I apply any kind of transformation to improve the sampling?

lbeta returns the natural logarithm of the beta function.


I corrected those mistakes, but I am still having problems with the sampling. Is it related to the tuning?

The log-increments are defined as

\log f(x_i) = -\log B(\alpha,\gamma)+\log\theta-(\alpha-1)\log(x_i/\theta)+(\alpha+\gamma)\log(1+x_i/\theta)
data {
  int<lower=0> N;  // number of observations
  vector<lower=0>[N] X;
}
parameters {
  real<lower=0> alpha;
  real<lower=0> gamma;
  real<lower=0> theta;
}
model {
  target += normal_lpdf(alpha | 2, 0.1);
  target += normal_lpdf(gamma | 10, 1);
  target += normal_lpdf(theta | 100, 5);

  target += -lbeta(alpha, gamma) + log(theta) - (alpha - 1) * log(X / theta) + (alpha + gamma) * log(1 + X / theta);
}
generated quantities {
  //real xrep[N];
}


There were 42 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
There were 3 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low
Examine the pairs() plot to diagnose sampling problems
The largest R-hat is 2.85, indicating chains have not mixed. Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable. Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable. Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

There’s a conspicuous absence of a loop in the line incorporating the likelihood. Also, just in case these are caused by numerical issues, try this implementation:

for (i in 1:N) target += -lbeta(alpha, gamma) + log(theta) - (alpha - 1) * (log(X[i]) - log(theta)) + (alpha + gamma) * log1p_exp(log(X[i]) - log(theta));


You might also move this line into its own function so you can re-use terms like log(theta) and log(X[i]).
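The numerical point behind log1p-style functions is that they evaluate log(1 + t) accurately even when t is tiny. A small Python illustration, with made-up values:

```python
import numpy as np

theta = 100.0
x = np.array([1e-8, 1e-3, 1.0, 50.0])  # made-up data values spanning several scales

direct = np.log(1.0 + x / theta)
stable = np.log1p(x / theta)                             # log1p(t) = log(1 + t), accurate for small t
via_logs = np.log1p(np.exp(np.log(x) - np.log(theta)))   # same quantity from log-scale inputs,
                                                         # mirroring Stan's log1p_exp(log(X) - log(theta))
print(direct, stable, via_logs)
```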

I am not sure a loop is necessary; I was using vectorized notation. Anyway, I also tried it that way, with the transformations you suggested, but didn’t succeed.
Could the shape of the beta function be causing trouble in the likelihood?
This density is challenging…

I am using the fact that

X_i= \frac{\theta \alpha}{\gamma} \frac{G_{\alpha}^{(i)}}{G_{\gamma}}

where G_{\alpha} and G_{\gamma} are Gamma-distributed random variables with mean 1. Then X_i follows the extended Pareto distribution defined above.
I define these two latent variables and the sampling works, but I still have problems with wide priors. This case seems similar to the mixture.

data {
  int<lower=0> N;  // number of observations
  vector<lower=0>[N] X;
}
parameters {
  real<lower=0> alpha;
  real<lower=0> gamma;
  real<lower=0> theta;
  vector<lower=0>[N] GA;
}
transformed parameters {
  vector<lower=0>[N] GG;
  for (i in 1:N) {
    GG[i] = (theta * alpha) * (GA[i] / X[i]);
  }
}
model {
  target += normal_lpdf(alpha | 2, 0.1);
  target += normal_lpdf(gamma | 10, 0.1);
  target += normal_lpdf(theta | 100, 5);

  target += gamma_lpdf(GA | alpha, alpha);
  target += gamma_lpdf(GG | gamma, 1);
}
generated quantities {
  //real xrep[N];
}
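The ratio representation can be checked by simulation: drawing X via the Gamma mixture and via \theta\alpha/\gamma \cdot G_\alpha/G_\gamma should give matching distributions. A numpy sketch with illustrative parameter values:

```python
import numpy as np

# Illustrative values: alpha = target shape, gamma = mixing shape, theta = scale.
alpha, gamma, theta = 2.0, 10.0, 100.0
rng = np.random.default_rng(2)
n = 200_000

# Construction 1: Gamma mixture, beta ~ Gamma(gamma, rate theta), X | beta ~ Gamma(alpha, rate beta)
beta = rng.gamma(shape=gamma, scale=1.0 / theta, size=n)
x_mix = rng.gamma(shape=alpha, scale=1.0 / beta)

# Construction 2: ratio of mean-1 Gammas, X = theta*alpha/gamma * G_alpha / G_gamma,
# with G_alpha ~ Gamma(alpha, alpha) and G_gamma ~ Gamma(gamma, gamma)
g_a = rng.gamma(shape=alpha, scale=1.0 / alpha, size=n)
g_g = rng.gamma(shape=gamma, scale=1.0 / gamma, size=n)
x_ratio = (theta * alpha / gamma) * g_a / g_g

# The empirical quartiles of the two constructions should agree closely.
for q in (0.25, 0.5, 0.75):
    print(np.quantile(x_mix, q), np.quantile(x_ratio, q))
```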