# Priors for hierarchical Gamma - Gamma model (Extended Pareto distribution)

Hi everyone,

I hope you are all doing well. I have the following code for fitting an extended Pareto distribution, which can be constructed as a Gamma distribution whose rate parameter is itself Gamma distributed.
The resulting density is

f(y)=\frac{\Gamma(\alpha+\theta)}{\Gamma(\alpha)\Gamma(\theta)} \frac{1}{\beta} \frac{(y/\beta)^{\theta-1}}{(1+y/\beta)^{\alpha+\theta}}

where \theta is the shape of the target Gamma and \alpha and \beta are the shape and rate of the mixing Gamma. Moreover,

E(Y)=\frac{\theta \beta}{\alpha-1}\\ sd(Y)=E(Y) \sqrt{\frac{\alpha+\theta-1}{\theta(\alpha-2)}}
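As a quick sanity check, these moment formulas can be verified by Monte Carlo. A sketch in Python (using the same illustrative values as the simulated data further down: \theta = 2, \alpha = 10, \beta = 100):

```python
import math
import random

random.seed(1)
theta, alpha, beta = 2.0, 10.0, 100.0  # target shape; mixing shape and rate

# Y | lambda ~ Gamma(theta, rate = lambda), with lambda ~ Gamma(alpha, rate = beta)
n = 200_000
ys = []
for _ in range(n):
    lam = random.gammavariate(alpha, 1.0 / beta)   # gammavariate takes scale = 1 / rate
    ys.append(random.gammavariate(theta, 1.0 / lam))

mean_mc = sum(ys) / n
sd_mc = math.sqrt(sum((y - mean_mc) ** 2 for y in ys) / n)

mean_formula = theta * beta / (alpha - 1)
sd_formula = mean_formula * math.sqrt((alpha + theta - 1) / (theta * (alpha - 2)))
print(mean_mc, mean_formula)   # both close to 22.2
print(sd_mc, sd_formula)       # both close to 18.4
```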

A simple model,

data {
  int<lower=0> N;
  vector<lower=0>[N] y_i;
}
parameters {
  real<lower=0> theta[N];
  real<lower=0> alpha;
  real<lower=0> gamma;
  real<lower=0> mu;
}
model {
  // target += normal_lpdf(alpha| 2, 0.5);
  // target += normal_lpdf(mu| 10, 2);
  // target += normal_lpdf(gamma| 100, 50);
  target += gamma_lpdf(theta| mu, gamma);
  target += gamma_lpdf(y_i| alpha, theta);
}
generated quantities {
  real yrep[N];
  for (n in 1:N) {
    yrep[n] = gamma_rng(alpha, gamma_rng(mu, gamma));
  }
}


I tested it with the following simulated data: n <- 10000; y <- rgamma(n, shape = 2, rate = rgamma(n, shape = 10, rate = 100))

How would you generally propose appropriate priors for this case?

With these priors the sampling works:
target += normal_lpdf(alpha| 2,0.5);
target += normal_lpdf(mu| 10,2);
target += normal_lpdf(gamma| 100,50);

But with these wider priors it runs into trouble:
target += normal_lpdf(alpha| 0,5);
target += normal_lpdf(mu| 0,25);
target += normal_lpdf(gamma| 0,250);


You have

real<lower=0> theta[N];

which suggests you're trying to estimate N + 3 parameters from N observations. I tried letting \theta be a scalar, but got into serious sampling problems. This doesn't seem to be a very easy model to fit.

Yes. I'll try to generate samples for lower values of \mu (the shape of the mixing distribution) and see what happens.
Generally, are the realizations of \theta considered actual parameters?

"If you don't see it, it's a parameter", Statistician, Bayesian. ;-)


I am trying this now, but the posteriors look strange. I may have an error in the formulae.

data {
  int<lower=0> N;         // number of observations
  vector<lower=0>[N] y_i; // T
}
parameters {
  real<lower=0> theta;
  real<lower=0> alpha;
  real<lower=0> gamma;
}
model {
  target += normal_lpdf(alpha| 2, 0.5);
  target += normal_lpdf(gamma| 10, 5);
  target += normal_lpdf(theta| 100, 30);

  target += log(lbeta(alpha, gamma)) - log(theta) + (alpha - 1) * log(y_i / theta) - (alpha + gamma) * log(1 + y_i / theta);
}
generated quantities {
  //real yrep[N];
}


f_{X_i}(x)=\int_0^{\infty} f_{X_i}(x|\beta)\, f(\beta)\, d\beta\\
f_{X_i}(x)=\int_0^{\infty} \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}\, \frac{\theta^{\gamma}}{\Gamma(\gamma)}\, \beta^{\gamma-1} e^{-\theta\beta}\, d\beta\\
f_{X_i}(x)=\frac{x^{\alpha-1}\,\theta^{\gamma}}{\Gamma(\alpha)\,\Gamma(\gamma)} \int_0^{\infty} \beta^{\alpha+\gamma-1} e^{-\beta(x+\theta)}\, d\beta\\
f_{X_i}(x)=\frac{x^{\alpha-1}\,\theta^{\gamma}}{\Gamma(\alpha)\,\Gamma(\gamma)} \frac{\Gamma(\alpha+\gamma)}{(x+\theta)^{\alpha+\gamma}} \int_0^{\infty} \beta^{\alpha+\gamma-1} e^{-\beta(x+\theta)}\, \frac{(x+\theta)^{\alpha+\gamma}}{\Gamma(\alpha+\gamma)}\, d\beta\\
f_{X_i}(x)=\frac{\Gamma(\alpha+\gamma)}{\Gamma(\alpha)\,\Gamma(\gamma)} \frac{x^{\alpha-1}\,\theta^{\gamma}}{(x+\theta)^{\alpha+\gamma}}\\
f_{X_i}(x)=\frac{\Gamma(\alpha+\gamma)}{\Gamma(\alpha)\,\Gamma(\gamma)} \frac{x^{\alpha-1}\,\theta^{\gamma}\,\theta^{\alpha}}{(x+\theta)^{\alpha+\gamma}\,\theta^{\alpha}}\\
f_{X_i}(x)=\frac{\Gamma(\alpha+\gamma)}{\Gamma(\alpha)\Gamma(\gamma)} \frac{1}{\theta} \frac{(x/\theta)^{\alpha-1}}{(1+x/\theta)^{\alpha+\gamma}}

(the integral in the fourth line is that of a Gamma(\alpha+\gamma,\, x+\theta) density, so it equals 1; the last line follows by dividing numerator and denominator by \theta^{\alpha+\gamma})
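The closed form can be checked numerically. A Python sketch (illustrative values \alpha = 2, \gamma = 10, \theta = 100) that integrates the density on a grid and compares its mean with \theta\alpha/(\gamma-1):

```python
import math

def log_f(x, alpha, gamma, theta):
    """Log of the extended Pareto density derived above."""
    lbeta = math.lgamma(alpha) + math.lgamma(gamma) - math.lgamma(alpha + gamma)
    return (-lbeta - math.log(theta)
            + (alpha - 1) * math.log(x / theta)
            - (alpha + gamma) * math.log1p(x / theta))

alpha, gamma, theta = 2.0, 10.0, 100.0
h = 0.05
xs = [h * (i + 0.5) for i in range(40_000)]   # midpoint grid on (0, 2000)
total = sum(math.exp(log_f(x, alpha, gamma, theta)) * h for x in xs)
mean = sum(x * math.exp(log_f(x, alpha, gamma, theta)) * h for x in xs)
print(total)                                  # ≈ 1 (density normalizes)
print(mean, theta * alpha / (gamma - 1))      # both ≈ 22.2
```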

Shouldn't this be just lbeta(...)? You may also be missing a minus sign there.

Hello, I got back to this. I missed the minus, but isn't it log(lbeta)? Should I apply any kind of transformation to improve the sampling?

lbeta returns the natural logarithm of the beta function.
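In other words, lbeta(a, b) = lgamma(a) + lgamma(b) - lgamma(a + b) already is \log B(a, b), so wrapping it in another log() takes the logarithm twice. A quick check in Python:

```python
import math

def lbeta(a, b):
    # log of the Beta function, computed from log-gamma as Stan's lbeta is
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

# B(2, 3) = Gamma(2) * Gamma(3) / Gamma(5) = 1 * 2 / 24 = 1 / 12
print(math.exp(lbeta(2, 3)))  # ≈ 0.08333
```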


I corrected those mistakes but I'm still having problems with the sampling. Is it related to the tuning?

The log-increments are defined as

\log f(x_i) = -\log B(\alpha,\gamma) - \log\theta + (\alpha-1)\log(x_i/\theta) - (\alpha+\gamma)\log(1+x_i/\theta)
data {
  int<lower=0> N;        // number of observations
  vector<lower=0>[N] X;
}
parameters {
  real<lower=0> alpha;
  real<lower=0> gamma;
  real<lower=0> theta;
}
model {
  target += normal_lpdf(alpha| 2, 0.1);
  target += normal_lpdf(gamma| 10, 1);
  target += normal_lpdf(theta| 100, 5);

  target += -lbeta(alpha, gamma) - log(theta) + (alpha - 1) * log(X / theta) - (alpha + gamma) * log(1 + X / theta);
}
generated quantities {
  //real xrep[N];
}


There were 42 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
There were 3 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low
Examine the pairs() plot to diagnose sampling problems
The largest R-hat is 2.85, indicating chains have not mixed. Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable. Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable. Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

There's a conspicuous absence of a loop in the line incorporating the likelihood. Also, just in case these are caused by numerical issues, try this implementation:

for (i in 1:N)  target += -lbeta(alpha, gamma) - log(theta) + (alpha - 1) * (log(X[i]) - log(theta)) - (alpha + gamma) * log1p_exp(log(X[i]) - log(theta));


You might also move this line into its own function so you can re-use terms like log(theta) and log(X[i]).
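For what it's worth, the point of log1p_exp(log(X[i]) - log(theta)) is that it evaluates log(1 + X[i]/theta) without ever forming 1 + X[i]/theta, which loses precision when X[i]/theta is tiny. A stand-alone Python sketch of the same trick (log1p_exp is reimplemented here; in Stan it is built in):

```python
import math

def log1p_exp(t):
    # stable evaluation of log(1 + exp(t)), as in Stan's log1p_exp
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))

x, theta = 1e-9, 100.0
t = math.log(x) - math.log(theta)

naive = math.log(1.0 + x / theta)   # 1 + 1e-11 rounds away ~5 significant digits
stable = log1p_exp(t)
print(stable, x / theta)            # both ≈ 1e-11
```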

I am not sure a loop is necessary; I was using vectorized notation. Anyway, I also tried it that way, with the transformations you suggested, but didn't succeed.
Isn't it the shape of the beta function causing trouble in the likelihood?
This density is challenging…

I am using the fact that

X_i= \frac{\theta \alpha}{\gamma} \frac{G_{\alpha}^{(i)}}{G_{\gamma}}

where G_{\alpha} and G_{\gamma} are gamma-distributed random variables with mean 1. Then X_i is an extended Pareto as we defined before.
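That representation is easy to check by simulation. A Python sketch with illustrative values \alpha = 2, \gamma = 10, \theta = 100, comparing the sample mean of X_i with the extended Pareto mean \theta\alpha/(\gamma-1):

```python
import random

random.seed(7)
alpha, gamma, theta = 2.0, 10.0, 100.0

n = 200_000
total = 0.0
for _ in range(n):
    g_a = random.gammavariate(alpha, 1.0 / alpha)   # Gamma(alpha, alpha), mean 1
    g_g = random.gammavariate(gamma, 1.0 / gamma)   # Gamma(gamma, gamma), mean 1
    total += theta * (alpha / gamma) * g_a / g_g

mean_mc = total / n
mean_exact = theta * alpha / (gamma - 1)            # = 200 / 9 ≈ 22.2
print(mean_mc, mean_exact)
```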
I am defining those two latent variables and the sampling works, but I still have problems with wide priors. This case seems to be similar to the mixture.

data {
  int<lower=0> N;        // number of observations
  vector<lower=0>[N] X;
}
parameters {
  real<lower=0> alpha;
  real<lower=0> gamma;
  real<lower=0> theta;
  vector<lower=0>[N] GA;
}
transformed parameters {
  vector<lower=0>[N] GG; // one latent G_gamma per observation
  for (i in 1:N) {
    GG[i] = (theta * alpha) * (GA[i] / X[i]);
  }
}
model {
  target += normal_lpdf(alpha| 2, 0.1);
  target += normal_lpdf(gamma| 10, 0.1);
  target += normal_lpdf(theta| 100, 5);

  target += gamma_lpdf(GA| alpha, alpha);
  target += gamma_lpdf(GG| gamma, 1);
}
generated quantities {
  //real xrep[N];
}