Problematic posterior fit for 2-component Gaussian mixture model

jaslenelin · August 4, 2019, 11:46pm

Dear Stan users:

I am currently fitting a single component and a 2-component Gaussian model to a real data set and using WAIC to select the preferred model.

I simulated a 2-component Gaussian model (having the data structure exactly like my real data to be modelled) and fit both correct 2-component and incorrect 1-component model using a customised-constructed loglikelihood function and my model is able to pick 2-component as having a low WAIC value and also both models converged (no divergent warnings, Rhat=1).

However, when I fit my model to the real data set ( which is suspected to have 2 components), I cannot get my model converged not matter how long I ran.

Actually, my customised likelihood function model is only to fit on group-level summary statistics of the data set (using 5-order summary stats) rather than all of the underlying data points and to verify I also just fit the same model using all of my classical observations and I can get a nice result and WAIC (for fitting all of the classical data points rather than just use 5-order statistics) also picked the 2-component model to be superior than the 1-component model.

I do not understand why is this the case. Can anyone please provide some suggestions as how I can figure out where it goes wrong? ( in the sense that why it works for the simulated data but it does not work for the real data and I cannot get the model fitted on the real data to converge)
Please share your Stan program and accompanying data if possible.

Below is my model code for the customised-constructed likelihood function where it uses only 5-order summary statics for a 3-hour block rather than using all observations within this 3-hour block

//For QUT data with 3-hour block interacting with day of the week and day of the year effects
//Fitting with the whole 17+24 days of data
functions{
  
  int r_in(int pos,int[] pos_var) {
    int pos_match;
    int all_matches[size(pos_var)];
    
    for (p in 1:(size(pos_var))) {
      all_matches[p] = (pos_var[p]==pos);
    }
    
    if(sum(all_matches)>0) {
      pos_match = 1;
      return pos_match;
    } else {
      pos_match = 0;
      return pos_match;
    }
    
  }
  
  real varying_lpdf(real [] y, int N, int[] kk, int[] index1, int[] index2, int[] index_11, int[] index_12, int[] index_18, int[] index_25, int[] index_26, int[] hour_index, int Ncol, int Ncol_complete, 
  int index_length, matrix ordered_effects_gen, real sigma, vector lambda_gen){
    
    
    vector[N] cdf;
    vector[N] sum_pdf;
    vector[index_length] diff;
    vector[Ncol] log_prob;
    vector[Ncol] col_sum_pdf;
  
    log_prob=rep_vector( 0, Ncol);
    col_sum_pdf = rep_vector(0, Ncol);
    
     for(n in 1:N){
      cdf[n] =  (1-lambda_gen[kk[n]]) * normal_cdf(y[n], ordered_effects_gen[kk[n],1], sigma) +lambda_gen[kk[n]] *normal_cdf(y[n],ordered_effects_gen[kk[n],2], sigma) ;
      sum_pdf[n] =   log_mix((1-lambda_gen[kk[n]]), normal_lpdf(y[n]| ordered_effects_gen[kk[n],1], sigma), normal_lpdf(y[n]| ordered_effects_gen[kk[n],2], sigma));
       
    }
    
    for(n in 1:index_length){


diff[n] = (cdf[index2[n]] - cdf[index1[n]]);

}

    
  for(n in 1:Ncol_complete){
  
  col_sum_pdf[hour_index[n]] =  sum_pdf[5*n-4] + sum_pdf[5*n-3] +sum_pdf[5*n-1] +sum_pdf[5*n-1]+sum_pdf[5*n];
  
   
    
   //for 3-hour blocks where there are 18 obs 
  if(r_in(hour_index[n], index_18)){
   log_prob[hour_index[n]] = 5 * log( diff[4*n-3] ) + 2*log(diff[4*n-2]) + log(diff[4*n-1]) + 5* log(diff[4*n]);
  }
  
  if(r_in(hour_index[n], index_25)){
   log_prob[hour_index[n]] = 5 * log( diff[4*n-3] ) + 5*log(diff[4*n-2]) + 5* log(diff[4*n-1]) + 5* log(diff[4*n]);
  }
  
  
  if(r_in(hour_index[n], index_26)){
   log_prob[hour_index[n]] = 5 * log( diff[4*n-3] ) + 2*log(diff[4*n-2]) + log(diff[4*n-1]) + 6* log(diff[4*n]);
  }
  
  if(r_in(hour_index[n], index_11)){
   log_prob[hour_index[n]] = 2* log( diff[4*n-3] ) + 2*log(diff[4*n-2]) + log(diff[4*n-1]) + log(diff[4*n]);
  }
  
  if(r_in(hour_index[n], index_12)){
   log_prob[hour_index[n]] = 2 * log( diff[4*n-3] ) + 2*log(diff[4*n-2]) + log(diff[4*n-1]) + 2* log(diff[4*n]);
  }
  
  
  
  
  
  }
    
    return(sum(log_prob)+sum(col_sum_pdf));
  }
  
  vector log_lik_vec_gen( int Ncol, int Ncol_complete, int[] index_11, int[] index_12, int[] index_18, int[] index_25, int[] index_26, int[] hour_index, vector diff_gen, vector sum_pdf_gen ){
    vector[Ncol] logg_prob1;
    vector[Ncol] col_sum_pdf_gen;
    
    
      logg_prob1=rep_vector( 0, Ncol);
      col_sum_pdf_gen = rep_vector( 0, Ncol);
      
  
  
for(n in 1:Ncol_complete){
   col_sum_pdf_gen[hour_index[n]] =  sum_pdf_gen[5*n-4] + sum_pdf_gen[5*n-3] +sum_pdf_gen[5*n-1] +sum_pdf_gen[5*n-1]+sum_pdf_gen[5*n];
 
   if(r_in(hour_index[n], index_18)){
     logg_prob1[hour_index[n]] = 5 * log( diff_gen[4*n-3] ) + 2*log(diff_gen[4*n-2]) + log(diff_gen[4*n-1]) + 5* log(diff_gen[4*n]);
   }
   
   if(r_in(hour_index[n], index_25)){
     logg_prob1[hour_index[n]] = 5 * log( diff_gen[4*n-3] ) + 5*log(diff_gen[4*n-2]) + 5* log(diff_gen[4*n-1]) + 5* log(diff_gen[4*n]);
   }
   
   
   if(r_in(hour_index[n], index_26)){
     logg_prob1[hour_index[n]] = 5 * log( diff_gen[4*n-3] ) + 2*log(diff_gen[4*n-2]) + log(diff_gen[4*n-1]) + 6* log(diff_gen[4*n]);
   }
   
   if(r_in(hour_index[n], index_11)){
     logg_prob1[hour_index[n]] = 2* log( diff_gen[4*n-3] ) + 2*log(diff_gen[4*n-2]) + log(diff_gen[4*n-1]) + log(diff_gen[4*n]);
   }
   
   if(r_in(hour_index[n], index_12)){
     logg_prob1[hour_index[n]] = 2 * log( diff_gen[4*n-3] ) + 2*log(diff_gen[4*n-2]) + log(diff_gen[4*n-1]) + 2* log(diff_gen[4*n]);
   }
   
}
 
 
    
     
return(logg_prob1+col_sum_pdf_gen);
  }
}

data {
  int<lower=1> N; //==1660 fully observed observations
   
   int<lower=1> Ncol; //==352 total 3-hour blocks to be estimated
   
   int<lower=1> Ncol_complete; //332 completely observed 3-hour blocks
   
   int<lower=1> index_length; //1328
   
   int<lower=1, upper=352> kk[N];//row index for fully observed observations
   
   real<lower=0> y[N];
  
   int<lower=0>  index1[index_length];
   
   int<lower=0>  index2[index_length];
   
   int<lower=0>  index_11[1];
   
   int<lower=0>  index_12[1];
   
   int<lower=0>  index_18[143];
   
   int<lower=0>  index_25[56];
   
   int<lower=0>  index_26[131];
   
   
   int<lower=0>  hour_index[Ncol_complete];
   int<lower=1> D;// indicating number of days fitted
   
   int<lower=1> K;// indicating hour of weeks 56
   
   matrix[K, K] K_TD; //penalty prior precision matrix for hour of the week coefficients
   
   int num_basis;
   
   matrix[D, num_basis] B_annual;
   
   matrix[K, (num_basis-1)] eta_bspline;
  
}

parameters {
  
  vector[num_basis] beta_annual_raw;//bspline coefficients for annual effects
  
  real<lower=0> sigma;

                                           
  vector[(num_basis-1)] beta_eta_raw;
  
  real mu_beta_eta;
  
  real<lower=0> sigma_beta_eta;
  //real<lower=0> tau;//RW penalty scale for beta_eta
 
  ordered[2] beta[K];// ordered hour of week effects: a matrix of 56 * 2 where 2nd col
  //is greater than 1st col for all rows.

}


transformed parameters {
  vector[(num_basis-1)] beta_eta;
  
  
  
  vector[2] alpha;//intercept of hour of week
  
  matrix[K,2] beta_adj;//sum to zero hour of week component mean
  
  vector[num_basis] beta_annual;//coefficients for annual effects
  
  
  beta_annual = beta_annual_raw*sigma;
  
  beta_eta = mu_beta_eta + sigma_beta_eta * beta_eta_raw;
  
  //beta_eta[1] =beta_eta_raw[1];
  
  //for(i in 2:(num_basis-1)){
    //beta_eta[i] = beta_eta[i-1] + beta_eta_raw[i] * tau;
  //}
  
  
  

  
  alpha[1] = mean(beta[,1]);
  alpha[2] = mean(beta[,2]);
  for(t in 1:K){
  beta_adj[t,1] = beta[t,1] - alpha[1];
  beta_adj[t,2] = beta[t,2] - alpha[2];

  }  
  
  beta_annual = beta_annual_raw* sigma;
  
  
 
}


model{
  vector[Ncol] annual_effects_gen;
  
  matrix[Ncol,2] ordered_effects_gen;
  
  
  vector[K] lambda;
  
  vector[Ncol] lambda_gen;
   
  beta_annual_raw~ normal(0, 1);//prior for the slopes following Gelman 2008
  
  sigma ~cauchy(0,2.5);
  
  beta_eta_raw~std_normal();
  
  sigma_beta_eta~std_normal();
  
  mu_beta_eta~std_normal();
  
  
  lambda = inv_logit(eta_bspline* beta_eta);
  
  for( t in 1:6){
    lambda_gen[(1+(t-1)*56): (56*t) ] = lambda;
  }
  
  for(t in 337: Ncol){
    lambda_gen[t] = lambda[t-(K*6)];
  
  }
  
  
   for(t in 1:D){
    annual_effects_gen[(1+(t-1)*8) : (8*t)] = rep_vector( (B_annual * beta_annual)[t], 8);
  }
  
  
  //44-day there are whole 6 weeks plus 2 days (16 3-hour blocks)
  //week 1-6 
  for (t in 1:Ncol){
    if(t <=K){
    ordered_effects_gen[t,1] = alpha[1] + beta_adj[t,1] + annual_effects_gen[t];
    ordered_effects_gen[t,2] = alpha[2] + beta_adj[t,2] + annual_effects_gen[t];
  }
    if(t >=(K+1) && t<= (K*2))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-K,1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-K,2] + annual_effects_gen[t];
    }
    
    if(t >=(2*K+1) && t<= (K*3))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(2*K),1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(2*K),2] + annual_effects_gen[t];
    }
    
    
    if(t >=(3*K+1) && t<= (K*4))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(3*K),1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(3*K),2] + annual_effects_gen[t];
    }
    
    if(t >=(4*K+1) && t<= (K*5))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(4*K),1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(4*K),2] + annual_effects_gen[t];
    }
    
    if(t >=(5*K+1) && t<= (K*6))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(5*K),1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(5*K),2] + annual_effects_gen[t];
    }
    
  }
  
  //remaining 16 of 3-hour blocks
  for(t in 337:Ncol){
    ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(K*6),1] + annual_effects_gen[t];
    ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(K*6),2] + annual_effects_gen[t];
  }
 
 
  
  
  target += multi_normal_prec_lpdf (beta_adj[,1]| rep_vector(0, K),( 1/square(sigma) * K_TD));
  
  target += multi_normal_prec_lpdf (beta_adj[,2]| rep_vector(0, K),( 1/square(sigma) * K_TD));
  
  //likelihood  
  
  
   target += varying_lpdf( y| N,  kk,  index1,  index2, index_11,  index_12,  index_18,  index_25, index_26, hour_index,  Ncol, Ncol_complete,  index_length, ordered_effects_gen, sigma, lambda_gen);
    
 
  
  
}

generated quantities{
  
  
  vector[Ncol] annual_effects_gen;
  
  matrix[Ncol,2] ordered_effects_gen;
  
  vector[K] lambda;
  
  vector[Ncol] lambda_gen;
  
  vector[Ncol] log_lik_vec;//loglikelihood per time period
  
  vector[N] cdf1;
  
  vector[N] sum_pdf_gen;
  
  vector[index_length] diff_gen;
  
  
  lambda = inv_logit(eta_bspline* beta_eta);
  
  for( t in 1:6){
    lambda_gen[(1+(t-1)*56): (56*t) ] = lambda;
  }
  
  for(t in 337: Ncol){
    lambda_gen[t] = lambda[t-(K*6)];
  
  }
  
  
   for(t in 1:D){
    annual_effects_gen[(1+(t-1)*8) : (8*t)] = rep_vector( (B_annual * beta_annual)[t], 8);
  }
  
  
  
  
  //44-day there are whole 6 weeks plus 2 days (16 3-hour blocks)
  //week 1-6 
  for (t in 1:Ncol){
    if(t <=K){
    ordered_effects_gen[t,1] = alpha[1] + beta_adj[t,1] + annual_effects_gen[t];
    ordered_effects_gen[t,2] = alpha[2] + beta_adj[t,2] + annual_effects_gen[t];
  }
    if(t >=(K+1) && t<= (K*2))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-K,1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-K,2] + annual_effects_gen[t];
    }
    
    if(t >=(2*K+1) && t<= (K*3))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(2*K),1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(2*K),2] + annual_effects_gen[t];
    }
    
    
    if(t >=(3*K+1) && t<= (K*4))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(3*K),1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(3*K),2] + annual_effects_gen[t];
    }
    
    if(t >=(4*K+1) && t<= (K*5))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(4*K),1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(4*K),2] + annual_effects_gen[t];
    }
    
    if(t >=(5*K+1) && t<= (K*6))
    {
      ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(5*K),1] + annual_effects_gen[t];
      ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(5*K),2] + annual_effects_gen[t];
    }
    
  }
  
  //remaining 16 of 3-hour blocks
  for(t in 337:Ncol){
    ordered_effects_gen[t,1] = alpha[1] + beta_adj[t-(K*6),1] + annual_effects_gen[t];
    ordered_effects_gen[t,2] = alpha[2] + beta_adj[t-(K*6),2] + annual_effects_gen[t];
  }
  
   for(n in 1:N){
      cdf1[n] =  (1-lambda_gen[kk[n]]) *normal_cdf(y[n], ordered_effects_gen[kk[n],1], sigma) + lambda_gen[kk[n]]*normal_cdf(y[n], ordered_effects_gen[kk[n],2], sigma);
      sum_pdf_gen[n] =  log_mix((1-lambda_gen[kk[n]]),normal_lpdf(y[n] | ordered_effects_gen[kk[n],1], sigma),normal_lpdf(y[n] | ordered_effects_gen[kk[n],2], sigma));
    }
    
    for(n in 1:index_length){


diff_gen[n] = (cdf1[index2[n]] - cdf1[index1[n]]);

}
  
  

  log_lik_vec = log_lik_vec_gen( Ncol, Ncol_complete, index_11, index_12, index_18, index_25,  index_26, hour_index,  diff_gen, sum_pdf_gen);
  
  
}






Here is the 2-component model for using all the of underlying classical data points 
```stan
//For QUT data with 3-hour block interacting with day of the week and day of the year effects
//Fitting with the whole 20+24 days of data
data {
  int<lower=1> N; // Total number of observations = 7373
  
  int<lower=1> K; // Index for 3-hour (8*7) day interaction ==56 per week
  
  int<lower=1> D;// indicating number of days fitted D=20+24 days for overall dataset
  
  int<lower=1> Ncol;//==352, 160 for first 20-day while 192 for 24-day
  
  int<lower=1,upper=Ncol> kk[N];//Hour of the day indicator for individual obs 
  
  vector[N] dat_complete;//defined LOG-TRANSFORMED observations
  
  int num_basis;
  
  matrix[D, num_basis] B_annual; //sum to zero cyclic bspline
  
  matrix[K, K] K_TD; //penalty prior precision matrix for hour of the week coefficients
  
  matrix[K, (num_basis-1)] eta_bspline;
  
}


parameters {
  
  vector[(num_basis-1)] beta_eta_raw;
  
  real<lower=0> tau;//RW penalty scale for beta_eta
 
  ordered[2] beta[K];// ordered hour of week effects: a matrix of 56 * 2 where 2nd col
  //is greater than 1st col for all rows.
  
  vector[num_basis] beta_annual_raw;//bspline coefficients for annual effects
  
  real<lower=0> sigma;

  
}

transformed parameters {
  vector[(num_basis-1)] beta_eta;
  
  vector[K] eta;
  
  
  vector[2] alpha;//intercept of hour of week
  
  matrix[K,2] beta_adj;//sum to zero hour of week component mean
  
  vector[num_basis] beta_annual;//coefficients for annual effects
  
  beta_annual = beta_annual_raw*sigma;
  
  beta_eta[1] =beta_eta_raw[1];
  
  for(i in 2:(num_basis-1)){
    beta_eta[i] = beta_eta[i-1] + beta_eta_raw[i] * tau;
  }
  
  eta = eta_bspline* beta_eta;
  

  alpha[1] = mean(beta[,1]);
  alpha[2] = mean(beta[,2]);
  
  for(t in 1:K){
  beta_adj[t,1] = beta[t,1] - alpha[1];
  beta_adj[t,2] = beta[t,2] - alpha[2];

  }  
  
}



model{
 
  vector[Ncol] annual_effects_gen;
  
  matrix[Ncol,2] ordered_overall_mean;
  
  vector[K] lambda;
  
  vector[Ncol] lambda_gen;
  
  beta_annual_raw~ normal(0, 1);//prior for the slopes following Gelman 2008
  
  sigma ~cauchy(0,2.5);
  
  beta_eta_raw~normal(0,1);
  
  tau~normal(0,1);
 
  
  target += multi_normal_prec_lpdf (beta_adj[,1]| rep_vector(0, K),( 1/square(sigma) * K_TD));
  
  target += multi_normal_prec_lpdf (beta_adj[,2]| rep_vector(0, K),( 1/square(sigma) * K_TD));
  
  lambda = inv_logit(eta);
  
  for( t in 1:6){
    lambda_gen[(1+(t-1)*56): (56*t) ] = lambda;
  }
  
  for(t in 337: Ncol){
    lambda_gen[t] = lambda[t-(K*6)];
  
  }
  
  
for(t in 1:D){
    annual_effects_gen[(1+(t-1)*8) : (8*t)] = rep_vector( (B_annual * beta_annual)[t], 8);
  }
  
 
  
  //44-day there are whole 6 weeks plus 2 days (16 3-hour blocks)
  //week 1-6 
  for (t in 1:Ncol){
    if(t <=K){
    ordered_overall_mean[t,1] = alpha[1] + beta_adj[t,1] + annual_effects_gen[t];
    ordered_overall_mean[t,2] = alpha[2] + beta_adj[t,2] + annual_effects_gen[t];
  }
    if(t >=(K+1) && t<= (K*2))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-K,1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-K,2] + annual_effects_gen[t];
    }
    
    if(t >=(2*K+1) && t<= (K*3))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(2*K),1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(2*K),2] + annual_effects_gen[t];
    }
    
    
    if(t >=(3*K+1) && t<= (K*4))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(3*K),1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(3*K),2] + annual_effects_gen[t];
    }
    
    if(t >=(4*K+1) && t<= (K*5))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(4*K),1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(4*K),2] + annual_effects_gen[t];
    }
    
    if(t >=(5*K+1) && t<= (K*6))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(5*K),1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(5*K),2] + annual_effects_gen[t];
    }
    
  }
  
  //remaining 16 of 3-hour blocks
  for(t in 337:Ncol){
    ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(K*6),1] + annual_effects_gen[t];
    ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(K*6),2] + annual_effects_gen[t];
  }
 
 
  
  //likelihood  
  
   for(n in 1:N){
      
      
      target +=log_mix((1-lambda_gen[kk[n]]), normal_lpdf(dat_complete[n]| ordered_overall_mean[kk[n],1], sigma),
                       normal_lpdf(dat_complete[n]| ordered_overall_mean[kk[n],2], sigma));
    }
    
  
  }


generated quantities{
  vector[N] log_lik_vec;//loglikelihood per time period
  
  vector[Ncol] annual_effects_gen;
  
  matrix[Ncol,2] ordered_overall_mean;
  
  vector[K] lambda;
  
  vector[Ncol] lambda_gen;
  
  lambda = inv_logit(eta);
  
  for( t in 1:6){
    lambda_gen[(1+(t-1)*56): (56*t) ] = lambda;
  }
  
  for(t in 337: Ncol){
    lambda_gen[t] = lambda[t-(K*6)];
  
  }
  
  
for(t in 1:D){
    annual_effects_gen[(1+(t-1)*8) : (8*t)] = rep_vector( (B_annual * beta_annual)[t], 8);
  }
  
 
  
  //44-day there are whole 6 weeks plus 2 days (16 3-hour blocks)
  //week 1-6 
  for (t in 1:Ncol){
    if(t <=K){
    ordered_overall_mean[t,1] = alpha[1] + beta_adj[t,1] + annual_effects_gen[t];
    ordered_overall_mean[t,2] = alpha[2] + beta_adj[t,2] + annual_effects_gen[t];
  }
    if(t >=(K+1) && t<= (K*2))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-K,1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-K,2] + annual_effects_gen[t];
    }
    
    if(t >=(2*K+1) && t<= (K*3))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(2*K),1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(2*K),2] + annual_effects_gen[t];
    }
    
    
    if(t >=(3*K+1) && t<= (K*4))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(3*K),1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(3*K),2] + annual_effects_gen[t];
    }
    
    if(t >=(4*K+1) && t<= (K*5))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(4*K),1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(4*K),2] + annual_effects_gen[t];
    }
    
    if(t >=(5*K+1) && t<= (K*6))
    {
      ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(5*K),1] + annual_effects_gen[t];
      ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(5*K),2] + annual_effects_gen[t];
    }
    
  }
  
  //remaining 16 of 3-hour blocks
  for(t in 337:Ncol){
    ordered_overall_mean[t,1] = alpha[1] + beta_adj[t-(K*6),1] + annual_effects_gen[t];
    ordered_overall_mean[t,2] = alpha[2] + beta_adj[t-(K*6),2] + annual_effects_gen[t];
  }
 
 
  
  
  
 for(n in 1:N){
      
      log_lik_vec[n] =log_mix((1-lambda_gen[kk[n]]), normal_lpdf(dat_complete[n]|ordered_overall_mean[kk[n],1], sigma),
                                 normal_lpdf(dat_complete[n]|ordered_overall_mean[kk[n],2], sigma));
      
  }
}

Rplot.pdf (1.0 MB)

The first graph is the traceplot for mixture location 1, on the left is fitted with the customised-constructed likelihood function for real data, while the right is fitted with all classical observations for the real data.

Rplot01.pdf (1.1 MB)
The second plot is the same but fitted with simulated 2-component data. It can be seen that the customised-likelihood function model mimcs the classical likelihood function model results very well and it also converged.

Any insights will be greatly appreciated.

martinmodrak · August 10, 2019, 10:04am

My best guess is that the model actually is a bad fit to the real data, i.e. either you have a bug in the model or the data are actually different than what you assume (or both).

I don’t understand why you would want to fit to summary stats only, but I think this strengthens the evidence that the way you coded your model has a bug/math mistake. Unfortunately the code you posted is quite long and I don’t have capacity to try to understand it and figure out what is wrong.

I would first focus on double checking the code does what you think it does. If it converges with simulated data, does it converge on correct values? If it converges on simulated data but not on the real dataset, the real dataset must be somewhat different from the simulated data. Try finding ways to simulate data that are closer to “real” and see if your model stops converging.

After that, my best advice would be to do posterior predictive checks (see manual and this forum if this is not familiar) on the non-converged model to see where its implications differ from the actual data.

Hope that helps.

Topic		Replies	Views
Bayesian Mixture Model Modeling pystan , rstan , fitting-issues , stan , mixture	2	584	December 29, 2022
Multivariate Gaussian mixture model help Modeling fitting-issues , specification	17	2761	April 14, 2020
Non-linear least squares fitting Issue Modeling fitting-issues	9	929	May 21, 2021
Mixture Models for sets of observations Modeling	3	654	May 8, 2020
Beta binomial mixture model Modeling mixture	5	1311	March 4, 2021

Problematic posterior fit for 2-component Gaussian mixture model

Related topics