GEV Not converging

I want to fit a generalised extreme value distribution (GEV), but cannot get it to converge, even with simulated data, with a model that estimates the mu, sigma, xi parameters.

functions{
        real gev_lpdf(vector y,real mu,real sigma,real xi){
            int N = num_elements(y);
            vector[N] t;
            real logp;
            vector[N] z;
            
            for(i in 1:N){
              z[i] = (y[i]-mu)/sigma;
              if(fabs(xi) > 0.0001) {
                  t[i] = pow(1 + xi*z[i],-1/xi);
              }
              else {
                t[i] = exp(-z[i]);
              }
            }
            
            logp = -N*log(sigma) + sum((xi+1)*log(t) - t);
            return logp;
        }
}

data {
    int<lower=0> n;
    vector[n] z;
}

parameters {
    real mu;
    real<lower=0> sigma;
    real xi;
}

model {
        sigma ~ cauchy(0,2) T[0,];
        xi ~  normal(0,5);
        mu ~ normal(0,5);

        z ~ gev(mu,sigma,xi);
}

Here is the call in RStudio

fitGEV <- stan (file = “gev-test.stan”, # Stan program
data = mydata, # named list of data
chains = 4, # number of Markov chains
iter = 8000, # total number of iterations per chain
cores = 4, # number of cores
init = list(list(mu=16, sigma=5.2, xi = -0.9),
list(mu=16, sigma=5.2, xi = -0.9),
list(mu=16, sigma=5.2, xi = -0.9),
list(mu=16, sigma=5.2, xi = -0.9)),
init_r = 0.2,
control = list(adapt_delta = 0.99,max_treedepth = 15))

The init values are the estimators for mu, sigma, xi from the Stata extreme package. (It makes no difference whether these are included or not).

I included input_r = 0.2 following an earlier post on a related topic.

I always get, whatever the number of iterations (even 16000):

  • 000s of divergent transitions
  • Bulk Effective Samples Size (ESS) is too low,
  • Tail Effective Samples Size (ESS) is too low.
  • very high R-hat values

Here is an example of the output:
Warning messages:
1: There were 10971 divergent transitions after warmup. Increasing adapt_delta above 0.99 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
2: Examine the pairs() plot to diagnose sampling problems

3: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
4: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

  mean se_mean   sd  2.5% 97.5% n_eff Rhat

mu 14.73 0.04 0.71 13.23 15.98 328 1.02
sigma 5.92 0.04 0.67 4.73 7.34 244 1.02
xi -0.83 0.01 0.09 -1.00 -0.65 147 1.02

I have tried many different prior distributions - nothing seems to make any difference.

Data can easily be generated by
library(“evd”)
z <- rgev(75,16,5.2,-0.9)

I am using the latest version of rstan, on a MacBook Pro.

Any suggestions much appreciated.

thanks
Mel

I guess you need to constrain xi. See related Generalized Pareto distribution case study, which shows also how you can check your lpdf implementation.

1 Like

Thank you I will study this and report.
Mel

Thanks very much. It is the case that if xi is constrained then it all works fine.

Eg if xi < 0
real<lower = sigma/(mu-max_z)> xi;

where max_z is the maximum data value.
However, you seem to need to know in advance if xi < 0 or xi >0. If xi > 0 then we want

real<upper = sigma/(mu-min_z> xi;

I don’t know if there is a way to declare conditional constraints.