Hi! I am new to Stan and am trying to fit a Bayesian mixture of GEV (generalized extreme value) distributions. However, the model does not seem to converge. Any help is appreciated!
gev_mix_model <-
"functions {
  // GEV
  real gev_lpdf(real y, real mu, real sigma, real xi) {
    real z = 1 + xi * (y - mu) / sigma;
    if (z <= 0)
      return negative_infinity();
    return -log(sigma) - (1 + 1/xi) * log(z) - pow(z, -1/xi);
  }
}
data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  simplex[3] theta;
  ordered[3] mu;
  vector<lower=0>[3] sigma;
  vector[3] xi;
}
model {
  mu ~ normal(0, 1000);
  sigma ~ normal(0, 1000);
  xi ~ normal(0, 10);
  // Loglikelihood
  for (n in 1:N) {
    vector[3] lps;
    for (k in 1:3) {
      lps[k] = log(theta[k]) + gev_lpdf(y[n] | mu[k], sigma[k], xi[k]);
    }
    target += log_sum_exp(lps);
  }
}
generated quantities {
  matrix[N, 3] z_prob;
  for (n in 1:N) {
    vector[3] lps;
    for (k in 1:3) {
      lps[k] = log(theta[k]) + gev_lpdf(y[n] | mu[k], sigma[k], xi[k]);
    }
    z_prob[n] = softmax(lps)';
  }
}"
Hi and welcome! A few things that can help folks answer your question: Can you post a snippet of data, or some simulated data, for folks to play with? What OS are you on? Which version of Stan are you using, and are you running it from R, Python, or something else? Can you post the error and any diagnostic plots that you've made?
Thanks!
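For anyone who wants something to play with before real data is posted, simulated data for a three-component GEV mixture could be generated roughly as follows. This is only a sketch in base R: the helper name `rgev_sketch` and all parameter values are made up for illustration, and it assumes xi is nonzero in every component (the same restriction as the `gev_lpdf` function above).

```r
# Hypothetical helper: draw n values from GEV(mu, sigma, xi) by inverting the CDF
# F(y) = exp(-(1 + xi * (y - mu) / sigma)^(-1 / xi)).
# Assumes xi != 0; the Gumbel limit (xi = 0) is not handled.
rgev_sketch <- function(n, mu, sigma, xi) {
  u <- runif(n)
  mu + sigma * ((-log(u))^(-xi) - 1) / xi
}

set.seed(1)
N <- 300

# Illustrative values only, not estimates from any real data set.
theta <- c(0.5, 0.3, 0.2)    # mixture weights
mu    <- c(-5, 0, 6)         # locations (increasing, to match ordered[3] mu)
sigma <- c(1, 1.5, 2)        # scales
xi    <- c(-0.2, 0.1, 0.2)   # shapes

k <- sample(1:3, N, replace = TRUE, prob = theta)  # latent component labels
y <- rgev_sketch(N, mu[k], sigma[k], xi[k])        # one draw per observation

# Data list with the names used in the data block of the model.
stan_data <- list(N = N, y = y)
```

Because the true parameters are known, a fit to `stan_data` shows whether the model can recover them at all before it is tried on real observations.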
Hi! I'm currently working with RStan (rstan package 2.32.2) in R 4.3.1 on Windows 11. I am using this small data set (min_feb_23.csv (8.4 KB)) to try to estimate the mixture parameters, but the results so far are poor, with no convergence at all (as the Rhat values show):
Inference for Stan model: anon_model.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 50% 97.5% n_eff Rhat
theta[1] 0.46 0.26 0.37 0.01 0.42 1.00 2 2120.49
theta[2] 0.37 0.27 0.38 0.00 0.25 0.99 2 3710.48
theta[3] 0.17 0.17 0.23 0.00 0.05 0.57 2 1724.30
mu[1] -2.81 3.94 5.57 -12.18 -0.75 2.45 2 10866.31
mu[2] 3.58 1.75 2.47 0.19 3.65 6.77 2 40.07
mu[3] 523.83 386.08 546.23 1.30 388.28 1277.86 2 135.04
sigma[1] 17.41 19.15 27.08 0.15 2.61 64.24 2 64486.63
sigma[2] 10.17 11.52 16.32 0.24 1.02 40.32 2 25.27
sigma[3] 320.07 255.26 361.09 1.61 194.11 879.11 2 143.40
xi[1] -0.72 0.82 1.17 -2.10 -0.70 0.71 2 69.47
xi[2] -4.45 5.27 7.46 -17.47 -0.49 0.49 2 141.39
xi[3] -1.26 3.26 4.62 -8.50 0.00 3.60 2 159.55
Samples were drawn using NUTS(diag_e) at Sat Jul 26 18:59:38 2025.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
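To make the Rhat column concrete, here is a minimal base R sketch of the basic split-chain potential scale reduction factor. The helper name `split_rhat_sketch` is made up, and this is the textbook formula rather than necessarily the exact computation rstan performs. The toy draws are random numbers, not output from the model above.

```r
# Hypothetical helper: basic split-Rhat for a single parameter.
# draws: matrix with one column per chain, one row per post-warmup iteration.
split_rhat_sketch <- function(draws) {
  half <- floor(nrow(draws) / 2)
  # Split every chain into its first and second half.
  s <- cbind(draws[1:half, , drop = FALSE],
             draws[(nrow(draws) - half + 1):nrow(draws), , drop = FALSE])
  n <- nrow(s)
  W <- mean(apply(s, 2, var))   # mean within-sequence variance
  B <- n * var(colMeans(s))     # between-sequence variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

set.seed(2)
mixed <- matrix(rnorm(4000), ncol = 4)            # four chains exploring the same target
stuck <- sweep(mixed, 2, c(-10, 0, 5, 500), "+")  # four chains sitting in different places

split_rhat_sketch(mixed)  # close to 1
split_rhat_sketch(stuck)  # far above 1
```

Values in the hundreds or thousands, as in the table above, mean the between-chain variance dwarfs the within-chain variance, that is, the chains are sitting in very different regions of parameter space.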
I have also tried to compile and run the .stan model file with the cmdstanr package, because I have read that it is the fastest interface and I wanted to change my priors, but R gave me this warning:
Running MCMC with 1 chain...
Warning: Chain 1 finished unexpectedly!
Warning message:
No chains finished successfully. Unable to retrieve the fit
I have normalized the data to check whether the problem was related to the value of z (the z <= 0 case in gev_lpdf), but I still got the same warning repeatedly, and now I'm kind of stuck.
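As a rough way to inspect the z condition outside Stan, the same quantity can be computed in base R for any candidate parameter values. The helper name `gev_z` and the numbers below are purely illustrative. The sketch shows that whether z > 0 holds depends on mu, sigma and xi as well as on y, so rescaling the data does not by itself guarantee it for every parameter value the sampler visits.

```r
# Hypothetical helper: the quantity called z inside gev_lpdf in the Stan model.
gev_z <- function(y, mu, sigma, xi) 1 + xi * (y - mu) / sigma

# Illustrative data and parameter values only.
y <- c(-1.2, 0.3, 0.8, 2.5)

# xi > 0: the support is bounded below at mu - sigma / xi (here -2).
z_pos <- gev_z(y, mu = 0, sigma = 1, xi = 0.5)
# xi < 0: the support is bounded above at mu - sigma / xi (here 2).
z_neg <- gev_z(y, mu = 0, sigma = 1, xi = -0.5)

sum(z_pos <= 0)  # observations outside the support when xi = 0.5
sum(z_neg <= 0)  # observations outside the support when xi = -0.5
```

In the mixture, a single component returning negative infinity for an observation is harmless as long as at least one other component gives it finite density. The log density for that draw is only negative infinity when every component excludes the observation.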
Thank you so much for your kind answer anyway!