Generalized Beta of the second kind (GB2)

I’m using Stan now for a bit more than a year, so please let me know if I’m trying to do something stupid here:

I want to estimate the parameters of a GB2 distribution that fit my data. I haven’t found a prebuild function in Stan like

normal_lpdf(y[n] | mu, sigma)

so I decided to built my own function. Starting from the GB2 function in wikipedia, I took the log and build my Stan model the following:

model_GB2 <-   '
functions{
    real gb2_lpdf(real y, real a, real b, real p, real q){
      return log(a)	+	(a*p-1)	*	log(y)	-	a*p	*	log(b)	-		beta_lpdf(y | p, q);
    }
  }
data{
  int<lower=0> N;
  vector[N] y;
}
parameters{
  real<lower=0> a;
  real<lower=0> b;
  real<lower=0> p;
  real<lower=0> q;
}
model{
  a ~ pareto(0,5);
  b ~ pareto(0,5);
  p ~ pareto(0,5);
  q ~ pareto(0,5);
  
  for (n in 1:N)
    target += gb2_lpdf(y[n]| a, b, p, q);
}

Running the model, Stan rejects the initial value as it encounters an error evaluating the log probability at the initial value. It also gets a random variable that is larger than 1 for the beta_lpdf part.

My assumption is that I should not be including the beta_lpdf part in the function, is that correct? (And if so, I’d appreciate learning why).
Is there a way to estimate GB2 parameters in Stan?

Are you sure y is in the correct range?

EDIT: corrected range.

I just double-checked, it is

OK. You might want to put some print(); statements in your program to try and work out where things are going awry.

One example is

for (n in 1:N)
     print(gb2_lpdf(y[n]| a, b, p, q));
    target += gb2_lpdf(y[n]| a, b, p, q);

so we know which density evaluations are going crazy.

1 Like

Thanks for the idea with stan print, it helped me debugging.

For future visitors of this post: The scale of the Pareto distribution must be strictly larger than zero. So this is what will work:

functions{
    real gb2_lpdf(real y, real a, real b, real p, real q){
      return log(a)	+	(a*p-1)	*	log(y)	-	a*p	*	log(b)	-		beta_lpdf(y | p, q);
    }
  }
data{
  int<lower=0> N;
  vector[N] y;
}
parameters{
  real<lower=0> a;
  real<lower=0> b;
  real<lower=0> p;
  real<lower=0> q;
}
model{
  a ~ pareto(0.001,5);
  b ~ pareto(0.001,5);
  p ~ pareto(0.001,5);
  q ~ pareto(0.001,5);
  
  for (n in 1:N)
    print(gb2_lpdf(y[N]| a, b, p, q));
    target += gb2_lpdf(y[N]| a, b, p, q);
}
1 Like

I wonder why the gb2_lpdf is not formulated as follows:

log(a) + (a * p - 1) * log(y) - a * p * log(b) - lbeta(p, q) - (p + q) * log (1 + (y / b)^a);

Thanks!