Generated quantities from stick-breaking mixtures


I’m writing a mixture model that generates its samples via the stick-breaking process. I’m trying to recover distinct mixture components in skin conductance response data recorded from a large, diverse sample of people. I adapted the stick-breaking code from this post (specifically the code that @Bob_Carpenter posted).

I’m finding very odd results when trying to generate the mixture assignments. It could be a gap in my understanding, but hopefully someone can clear up any misunderstandings. When I use the code below to simulate the model, I obtain an identical mixture assignment for every observation when taking the mode across all chain samples.

data {
  int<lower=0> K;  // Number of clusters
  int<lower=0> N;  // Number of observations
  real y[N];  // observations
}

parameters {
  vector<lower=0,upper=1>[K - 1] v;  // stickbreak components
  real<lower=0, upper=50> lambda[K]; // exponential rate parameters
  real<lower=0, upper=15> alpha;  // DP concentration parameter in DP(alpha, base)
}

transformed parameters {
  // stick-breaking process
  simplex[K] eta;

  real stick_used = 0;  // probability mass assigned so far
  eta[1] = v[1];
  stick_used = eta[1];
  for (k in 2:(K - 1)) {
    eta[k] = (1 - stick_used) * v[k];
    stick_used += eta[k];
  }
  eta[K] = 1 - stick_used;
}

model {
  real ps[K];
  alpha ~ gamma(3, 0.5); 
  v ~ beta(1, alpha);
  

  for(i in 1:N){

    for(k in 1:K){
      ps[k] = log(eta[k]) + exponential_lpdf(y[i] | lambda[k]);
    }
    target += log_sum_exp(ps);
  }
}

generated quantities{
  vector[N] log_lik; // log-likelihood for calculation of LOO
  int<lower=1, upper=K> Z[N]; // group index

  for (n in 1:N) {
    vector[K] ll;
    
    for(k in 1:K) {
      ll[k] = log(eta[k]) + exponential_lpdf(y[n] | lambda[k]);     
    }
    
    log_lik[n] = log_sum_exp(ll);
    Z[n] = categorical_rng(exp(ll - log_lik[n]));
  }
}
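For reference, the stick-breaking construction in the transformed parameters block is just arithmetic, so here is a minimal numeric sketch of the same loop (Python, mirroring the Stan code above):

```python
import numpy as np

def stick_breaking(v):
    """Map K-1 stick-breaking fractions v in (0,1) onto a K-simplex,
    mirroring the transformed parameters block above."""
    v = np.asarray(v, dtype=float)
    eta = np.empty(len(v) + 1)
    remaining = 1.0  # unallocated stick length
    for k, vk in enumerate(v):
        eta[k] = remaining * vk     # break off a fraction vk of what's left
        remaining -= eta[k]
    eta[-1] = remaining             # last component takes the leftover stick
    return eta

eta = stick_breaking([0.5, 0.5, 0.5])
# eta = [0.5, 0.25, 0.125, 0.125], which sums to 1
```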

I run the model in R as follows:


load('SCR_array.RData')
SCR <- SCR_array
n <- length(SCR)

rstan_options(auto_write = TRUE)
options(mc.cores = 2)

dataList <- list(K=4, N=n, y=SCR)

modelFile <- 'mixture_SCR_model.stan'
nIter     <- 6000
nChains   <- 4
nWarmup   <- floor(nIter/2)
nThin     <- 1

fit_reg <- stan(modelFile,
                  data    = dataList,
                  chains  = nChains,
                  iter    = nIter,
                  warmup  = nWarmup,
                  thin    = nThin,
                  init    = "random",
                  seed    = 1450154626)

When I try to extract the mode of the mixture assignments (I adapted this post’s generated quantities block), I find the oddity below. Sometimes the modal component is 3, sometimes 4, but every time the assignments collapse onto a single component, like a delta spike.

> fit_ppd <- extract(fit_reg)
> apply(fit_ppd$Z, 2, stat.mode)
 [1] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
 [47] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
 [93] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[139] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[185] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[231] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[277] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[323] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
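(`stat.mode` is just a small helper that returns the most frequent value in a vector; for clarity, here is the same per-observation computation sketched in Python, assuming an array shaped like `fit_ppd$Z`, i.e. draws × observations:)

```python
import numpy as np

def column_modes(Z):
    """Most frequent sampled assignment per observation (column),
    for an array of shape (n_draws, N) with entries in 1..K."""
    modes = np.empty(Z.shape[1], dtype=int)
    for j in range(Z.shape[1]):
        vals, counts = np.unique(Z[:, j], return_counts=True)
        modes[j] = vals[np.argmax(counts)]
    return modes

# toy draws: 3 posterior draws, 2 observations
Z = np.array([[1, 4],
              [1, 4],
              [2, 4]])
column_modes(Z)  # -> [1, 4]
```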

Elsewhere on the forum, someone has suggested using this method to generate the mixture assignments.

generated quantities{
  vector[N] log_lik; // log-likelihood for calculation of LOO
  int<lower=1, upper=K> Z[N]; // group index

  for (n in 1:N) {
    vector[K] ll;
    
    for(k in 1:K) {
      ll[k] = log(eta[k]) + exponential_lpdf(y[n] | lambda[k]);     
    }
    
    log_lik[n] = log_sum_exp(ll);
    Z[n] = categorical_logit_rng(ll); // only line that differs from the example above
  }
}

This yields a nearly identical result to the above. Other posts about generating mixture assignments from mixture models use nearly identical code. So I suspect both versions are the same under the hood, differing only in where the softmax/logit normalization happens?
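To convince myself of that, here is a quick numeric check (Python, made-up log weights) that `exp(ll - log_sum_exp(ll))` is exactly the softmax that `categorical_logit_rng` applies internally, so both generated quantities blocks sample `Z` from the same distribution:

```python
import numpy as np

# unnormalized log mixture weights for one observation (made-up values)
ll = np.array([-3.2, -1.1, -2.5, -4.0])

# categorical_rng version: normalize explicitly via log_sum_exp
log_norm = np.log(np.sum(np.exp(ll - ll.max()))) + ll.max()  # stable log_sum_exp
p_explicit = np.exp(ll - log_norm)

# categorical_logit_rng version: softmax of the raw log weights
p_softmax = np.exp(ll) / np.sum(np.exp(ll))

assert np.allclose(p_explicit, p_softmax)  # same distribution over components
```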

There are some differences between my model block and the examples I found, but nothing substantive. What I did notice is that none of them used an explicit stick-breaking process in the transformed parameters block. So I examined whether removing that component from the model would change the outcome, and, as expected, it did. (The first post I mentioned says that `simplex` uses a stick-breaking transform internally, so this seemed reasonable enough; placing a Dirichlet prior on the simplex doesn’t change the outcome either, from what I’m finding.) Here’s the code for that model below.

data {
  int<lower=0> K;  // Number of clusters
  int<lower=0> N;  // Number of observations
  real y[N];  // observations
}

parameters {
  simplex[K] eta;
  real<lower=0, upper=50> lambda[K]; // exponential rate parameters
}

model {
  real ps[K];
  // placing a Dirichlet prior on eta here doesn't change the results, but one can check
  
  for(i in 1:N){

    for(k in 1:K){
      ps[k] = log(eta[k]) + exponential_lpdf(y[i] | lambda[k]);
    }
    target += log_sum_exp(ps);
  }
}

generated quantities{
  vector[N] log_lik; // log-likelihood for calculation of LOO
  int<lower=1, upper=K> Z[N]; // group index

  for (n in 1:N) {
    vector[K] ll;
    
    for(k in 1:K) {
      ll[k] = log(eta[k]) + exponential_lpdf(y[n] | lambda[k]);     
    }
    
    log_lik[n] = log_sum_exp(ll);
    Z[n] = categorical_rng(exp(ll - log_lik[n]));
  }
}
The mode of the mixture assignments now looks like this:

  [1] 2 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 4 1 3 3 3 3 3 3 4 3 3 3 3 3 4 3 3 3 3 3
 [47] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 4 3 3 3 3 3 4 3 3 3
 [93] 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3
[139] 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 4 4
[185] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 3 3 4
[231] 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3
[277] 3 3 3 3 3 3 1 3 3 3 3 3 3 4 3 4 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3
[323] 3 3 3 3 3 3 3 3 4 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

In fact, the assignments can be more varied still; here is a separate attempt using the same Stan code:

  [1] 4 3 3 3 3 4 1 3 1 3 1 2 2 3 1 1 1 3 1 3 4 1 1 3 3 3 1 4 4 3 3 3 1 1 1 4 2 1 4 3 4 1 4 2 1 3
 [47] 2 2 3 3 2 1 3 3 3 1 2 1 3 1 1 3 1 4 3 4 4 2 3 3 4 1 4 2 3 3 4 4 4 2 4 4 1 2 1 1 3 4 4 1 4 4
 [93] 3 4 1 3 4 1 3 2 3 1 1 1 1 4 1 3 2 1 3 4 3 2 4 3 1 3 4 3 3 1 1 3 3 4 3 1 4 1 4 1 1 4 1 4 3 3
[139] 1 3 2 4 3 1 2 3 2 4 2 3 4 3 1 3 1 1 4 1 4 3 1 4 1 3 1 4 1 1 4 1 4 4 3 1 1 2 3 1 1 3 1 4 1 4
[185] 1 3 1 1 1 1 1 4 1 1 1 2 4 3 1 4 3 1 1 2 2 1 4 4 4 1 4 1 2 4 3 4 4 3 3 4 4 4 1 4 1 1 1 2 4 1
[231] 1 3 2 3 1 4 3 1 4 1 4 3 2 3 3 1 1 4 1 1 4 1 3 1 3 1 3 2 3 4 1 4 4 3 1 3 3 3 4 3 2 1 2 4 3 1
[277] 3 4 3 4 1 4 4 4 1 4 4 2 3 1 3 4 1 3 1 4 1 2 1 1 1 3 4 1 4 4 1 2 1 1 1 1 3 3 4 4 4 3 4 2 1 1
[323] 3 4 3 1 1 1 2 1 4 1 3 4 4 1 1 2 1 4 4 3 3 2 3 4 1 4 1 3 2 2 4 3 3 3 1 2 1

The issue is that none of the generated mixture assignments reflect the mixture proportions I find in the posterior distributions (sorry, I haven’t shown those, but I’ve attached the data for reproduction/replication). Even in the runs where the assignments do vary a little, the generated frequencies don’t match the posterior mixture proportions. Is there a specific way to generate mixture assignments from a hand-coded stick-breaking process that differs from doing so without it? Is there a specific detail I’m missing that would explain the results I’m obtaining?
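To be concrete about the check I’m doing: I compare the empirical frequency of each component among the sampled `Z` against the posterior mean of `eta`. A sketch of that comparison in Python, with made-up stand-in draws (the real arrays would come from `extract()`):

```python
import numpy as np

# stand-in posterior draws (made-up values; shapes mirror extract() output:
# eta_draws is draws x K, Z_draws is draws x N)
eta_draws = np.array([[0.6, 0.2, 0.1, 0.1],
                      [0.5, 0.3, 0.1, 0.1]])
Z_draws = np.array([[1, 1, 2, 1, 3],
                    [1, 2, 1, 1, 4]])

K = eta_draws.shape[1]
post_mean_eta = eta_draws.mean(axis=0)  # posterior mean mixture weights
assign_freq = np.array([(Z_draws == k).mean() for k in range(1, K + 1)])

# if the generated quantities block is consistent, assign_freq should
# roughly track post_mean_eta:
# here assign_freq = [0.6, 0.2, 0.1, 0.1] vs post_mean_eta = [0.55, 0.25, 0.1, 0.1]
```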

I’d appreciate any help.

SCR_array.RData (2.8 KB)


I tried replicating the pattern I noticed with simulated data, and the generated mixture assignments worked as expected. It could be that I need to transform my data.