Divergences and bias in simple extreme-value model


I am working on a project that aims to quantify changes in the occurrence probabilities of climate extremes. To this end, we use a generalized extreme value (GEV) distribution with time-varying parameters to model the data. We have coded the model in C++ (not Stan) and use particle filtering to sample from the posterior. The model performs well. In the next phase of the project we want to expand the model and thought that using Stan would make our life much easier and so we decided to give it a shot. However, we have found that in the Stan model often different chains converge to very different values, some of which values are unphysical. At first, we blamed the lack of sufficient data (climate extremes are rare and so we don’t have a lot of data), even though the particle filter version of the model does not show these issues.

To shed some light on this, we played with a ‘toy’ model on synthetic data. In particular, we generated 500 samples from a GEV with constant predefined parameters (location, scale, and shape) and then tried to estimate those parameters by fitting a GEV using Stan. This is a very simple model and 500 values are a lot of data, so we were expecting to estimate the parameters with confidence. However, even in this simple case, the model still shows divergent chains and the chains that seem to converge to the right values produce, in fact, biased estimates of the parameters. Note that the same model written in PyMC3 gives the right value of the parameters. Estimate based on maximum likelihood estimation are also correct. I suspect that, being new with Stan, we have not written the model correctly, so if you could help me figure out what is wrong with the model and put me out of my misery I would be very grateful.

In this case, we set iter=3000, warmup=1000, chains =8. I attach a figure showing the 8 chains for ‘sigma’. I also attach the synthetic data (data.txt (8.3 KB)
). The true value of the parameters is mu=1.2, sigma=0.1, xi=0.05. Code for the model below.

Many thanks,

        // GEV log pdf
        real gev_lpdf(vector y,real mu,real sigma,real xi){
            vector[rows(y)] z;
            vector[rows(y)] logp;
            vector[rows(y)] zi;
            z = 1 + xi*(y - mu)/sigma;
            for(i in 1:rows(y)){
                zi[i] = z[i]^(-1/xi);
            logp = log(sigma) + (1 + 1/xi)*log(z) + zi;
            return -sum(logp); 
    data {

        int<lower=0> no;
        vector[no] y; //vector containing data
    parameters {
        real<lower=0> sigma;
        real xi;
        real<lower=0> mu;

    model {
        sigma ~ normal(0.1,1);
        xi ~ normal(0.05,1);
        mu ~ normal(1.2,1);
        y ~ gev(mu,sigma,xi);


1 Like

You mean like this?

        mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
sigma   0.09    0.00 0.00   0.09   0.09   0.09   0.10   0.10  2398    1
xi      0.05    0.00 0.03  -0.01   0.03   0.05   0.07   0.11  2403    1
mu      1.20    0.00 0.00   1.19   1.20   1.20   1.20   1.21  2992    1
lp__  385.33    0.03 1.20 382.22 384.74 385.64 386.23 386.74  1893    1

Had to call stan with init_r = 0.2 to prevent it from getting stuck -500 lp__ units below where it is supposed to be and diverging when it tries to escape.

1 Like

Thanks, setting init_r = 0.2 seems to resolve the issue. I suspected the problem was related to the initialization of the parameters and, in fact, I had tried to initialize the parameters manually, but whatever it was I did didn’t work.