Initial value error (log(0)) for user defined beta prior distribution in hierarchical generalized extreme value (GEV) distribution

[edit: formatted Stan code]

I have a user defined Beta distribution.

Beta(\theta+0.5|p,q) = (0.5 + \theta)^{p-1} (0.5 - \theta)^{q-1} / B(p,q) between [-0.5,0.5]

where B(p,q) = \Gamma(p) \Gamma(q) / \Gamma(p+q)

real mbeta_lpdf(real theta, real p, real q) {
      real lpp;
      real b;
      lpp = pow(0.5 + theta,p-1) * pow(0.5 - theta,q-1);
      b = (tgamma(p-1)*tgamma(q-1))/tgamma(p+q-1);
      return log(lpp) - log(b);

This Beta distribution is implemented into a hierarchical generalized extreme value (GEV) distribution for the shape parameter (xi). For the location (mu) and scale (sigma) parameters I use a normal and gamma prior distributions respectively with uniform hyperpriors (alpha = uniform(-infinity,+infinity) and delta = uniform(0,+infinity).

This is the whole model:

functions {
    real gev_lpdf(vector y, real mu, real sigma, real xi) {
        int N = rows(y);
        vector[rows(y)] z;
        vector[rows(y)] lp;
        for(n in 1:N) {
            z[n] = 1 + (y[n] - mu) * xi / sigma;
            lp[n] = (1 + (1 / xi)) * log(z[n]) + pow(z[n], -1/xi);
        }        
        return -sum(lp) -  N*log(sigma);
    }
    
    real mbeta_lpdf(real theta, real p, real q) {
      real lpp;
      real b;
      lpp = pow(0.5 + theta,p-1) * pow(0.5 - theta,q-1);
      b = (tgamma(p-1)*tgamma(q-1))/tgamma(p+q-1);
      return log(lpp) - log(b);
    }
}
data {
  int<lower=0> N;    //observation and duration vector length
  int<lower=1> J;    //number of weather stations
  vector[N] y[J];      //data matrix
}
parameters {
  real<lower=-0.5,upper=0.5> xi[J];
  real mu[J];
  real<lower=0> sigma[J];
  real alpha;
  real<lower=0> delta;
}
model {
  for(j in 1:J){
    xi[j] ~ mbeta(6.0,9.0);
    mu[j] ~ normal(alpha,25);
    sigma[j] ~ gamma(delta,delta);
    y[j] ~ gev(mu[j],sigma[j],xi[j]);
    }
}

Here is my R code. I simulated some GEV samples using the revd function from the package extRemes:

library(rstan)

y = matrix(NA,nrow = 20, ncol = 41)

mu = runif(20,15,20)
sigma = runif(20,3,5)
xi = runif(20,-0.5,0.5)

library(extRemes)

for (i in 1:20) {
  y[i,] = revd(41,loc=mu[i],scale=sigma[i],shape=xi[i],type = "GEV")
}

N = 41
J = 20

dir = "~/Documents/MEGAshare/IDF_data_SG/BHM/Model_sensitivity/stackoverflow/"
model = stan_model(paste0(dir,"GEV_hier.stan"))

fit = sampling(model,list(N=N,D=D,J=J,y=y),iter=100,chains=1,seed=10)

When I run this code unfortunately it throws an error:

...
Chain 1: Rejecting initial value:
Chain 1:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 1:   Stan can't start sampling from this initial value.
Chain 1: 
Chain 1: Initialization between (-2, 2) failed after 100 attempts. 
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

Where have I gone wrong here?

I’m not 100% sure what your problem is, but I’d start with a more arithmetically stable implementation of mbeta_lpdf. Both numerator and denominator may underflow as written. In general, try keeping everything on the log scale, so that’d be

real mbeta_lpdf(real theta, real p, real q) {
  real log_lpp = (p - 1) * log(0.5 + theta) + (q - 1) * log(0.5 - theta);
  real log_b = lgamma(p - 1) + lgamma(q - 1) - lgamma(p + q - 1);
  return log_lpp - log_b;
}

You can also vectorize gev_lpdf for efficiency,

real gev_lpdf(vector y, real mu, real sigma, real xi) {
  int N = rows(y);
  vector[N] z = 1 + (y - mu) * xi / sigma;
  vector[N] lp = (1 + 1 / xi) * log(z) + pow(z, -1/xi);
  return -sum(lp) -  N * log(sigma);
}

You can also vectorize the model code,

xi ~ mbeta(6, 9);
mu ~ normal(alpha, 25);
sigma ~ gamma(delta, delta);
for (j in 1:J) {
  y[j] ~ gev(mu[j], sigma[j], xi[j]);
}

You probably want at least weakly informative priors for alpha and delta, too.

In general, to debug, I’d suggest building the program up from simpler pieces. For example, try starting with just the prior on xi to make sure the problem’s not in the definition of mbeta. Then build something with just a single data point and the likelihood to make sure there’s not a problem with gev. And remember you can use print() statements to inspect the state of the code as it’s executing to see if things become NaN or infinite at some point (you can use target() to get the current state of the target density accumulator).

1 Like