Bayesian Mixture Model

I’m newbie using STAN. I’m trying to perform a Bayesian mixture model (mixing Gaussian and GEV distributions). However, the model has convergence problem. Any comments on improving it.Thanks!

functions{
     
  real gev_lpdf(vector y, real loc_gev, real scale_gev, real shape_gev) {
    vector[rows(y)] z;
    vector[rows(y)] lp;
    real loglike;
    int N;
         
    N = rows(y);
         
    if (fabs(shape_gev) > 1e-10) {
      for(i in 1:N){
        z[i] = 1 + shape_gev * (y[i] - loc_gev) / scale_gev;
        lp[i] = (1 + 1 / shape_gev) * log(z[i]) + pow(z[i], -1 / shape_gev);
      }
      loglike = - sum(lp) - N * log(scale_gev);
    } else {
      for (i in 1:N){
        z[i] = (y[i] - loc_gev) / scale_gev;
        lp[i] = z[i] + exp(-z[i]);
      }
      loglike = - sum(lp) - N * log(scale_gev);
    }
         
    return(loglike);
  }
}

data {
  int<lower = 0> N;
  vector[N] y;
}

transformed data {
  real miny;
  real maxy;
     
  miny = min(y);
  maxy = max(y);
}

parameters {
  real<lower=-0.5,upper=0.5> shape_gev;
  real<lower=0> scale_gev;
  real<lower= (shape_gev > 0 ? miny : negative_infinity()),
      upper= (shape_gev > 0 ? positive_infinity() : maxy) > loc_gev;
       
  real<lower=0, upper=0.5> loc_nor;
  real<lower=0 > scale_nor;
  real<lower=0, upper=1> theta;
}

model {
  shape_gev ~ normal(-0.5, 0.5);
  loc_gev ~ normal(0.06, 10);
  scale_gev ~ normal(0.09, 10);
  loc_nor  ~ normal(0.23, 25);
  scale_nor ~ normal(0.03, 10);
  theta ~ uniform(0,1);
 
  target += log_mix(theta,
                    normal_lpdf(y | loc_nor, scale_nor),
                    gev_lpdf(y | loc_gev,scale_gev,shape_gev));
}

this model assumes that the entire vector of y is drawn entirely from either a normal or GEV distribution - that is not what the log_mix function wants.
see 5.4 Vectorizing mixtures | Stan User’s Guide

since you’re defining function gev_lpdf, you should make sure it’s a properly normalized density. furthermore, the function definition raises some warning flags in that you have branching logic - keeping things away from zero? and the else branch doesn’t do anything with shape_gev. I don’t know enough about GEV distributions to appreciate your code, but you probably shouldn’t be trying to roll your own here. if BRMS has this available, you might try seeing what it generates and take it from there.

1 Like

Thanks in advance for your help @mitzimorris. I will carefully check these aspects then.