Rplot.pdf (5.3 KB)
Dear stan users:
I am currently constructing a 2-component time-varying Gaussian mixture model using Rstan,the full model is a bit complex and thus right now I just construct a smaller model to examine the model behaviour.
My model is like:
E(y_it) = \lambda_t * N(\beta_{t1}, \sigma) + (1-\lambda_t) * N( \beta_{t2}, \sigma) for all observations i =1: n ( in the real data, n=12 and there is some missing observations, for the simulation study n is set be 50 with the same missing patterns as in the real data)
and t is the time index, currently t=1:72.
the mixing weights and mixture locations are all time-varying with time-invariant and component-shared scale \sigma. Right now, I only focus on estimating \lambda_t and \sigma ( supplying “true” values of \beta_{t1} and \beta_{t2}, to ensure identifiability, all \beta_{t2} >\beta_{t1})
the mixing weights are believed to be slowly and smoothly changing over time and thus I use model it with a bspline with a RW(1) penalty prior ( seen post by http://mc-stan.org/users/documentation/case-studies/splines_in_stan.html).
Right now the model estimates sigma well. But in fact, the “true” \lambda_t is the 1-\lambda_t in the posterior samples. I do not understand why as I already supply the mixture locations \beta_{t1} and \beta_{t2} as data, why there is still label switching? \lambda_t is modelled by inverse logit of \eta_t where \eta_t is the product of the Bspline basis matrix  eta_bspline and the coefficients \beta_{\eta}. Can someone please provide some insights? The posterior mean of (1-\lambda_t) is attached, the red line is the simulated “true” values while the black line is the posterior means for  (1-\lambda_t)
The stan code is attached below
data {
  
  int<lower=1> N; // number of items that are defined
  
  int<lower=1> K; //number of columns==72 hours in a week 
  
  int<lower=1> D;// indicating number of days fitted
  //for simulation D=3, for real full 2-week data D=14
  
  int<lower=1> Ncol;
  
  int<lower=1,upper=Ncol> kk[N];//hour of week index 1:72
  
  vector[N] dat_complete;//defined LOG-TRANSFORMED observations
  
  matrix[K,2] ordered_overall_mean;
  
  int<lower=0> num_basis;
  
  matrix[K, (num_basis-1)] eta_bspline;
   
  
}
parameters {
  vector[(num_basis-1)] beta_eta_raw;
  
  real<lower=0> tau; \\RW penalty parameter for eta 
  
  real<lower=0> sigma;
  
}
transformed parameters{
  vector[(num_basis-1)] beta_eta;
  
  vector[K] eta;
  
  vector[K] lambda;
  
  beta_eta[1] =beta_eta_raw[1];
  
  for(i in 2:(num_basis-1)){
    beta_eta[i] = beta_eta[i-1] + beta_eta_raw[i] * tau;
  }
  
  eta = eta_bspline* beta_eta;
  
  lambda = inv_logit(eta);
  
}
model{
  
  beta_eta_raw~normal(0,1);
  
  tau~normal(0,1);
  
  sigma~cauchy(0,2.5);
 
  //likelihood  
  
   for(n in 1:N){
      
      
      target +=log_mix(lambda[kk[n]], normal_lpdf(dat_complete[n]| ordered_overall_mean[kk[n],1], sigma),
                     normal_lpdf(dat_complete[n]| ordered_overall_mean[kk[n],2], sigma));
  }
    
 
  
  }
Thank you for your feedback.