I’m writing a mixture model that generates its mixture weights via the stick-breaking process. I’m trying to recover any distinct mixture components in skin conductance response data recorded from a large, diverse sample of people. I adapted the stick-breaking code from this post (specifically the code that @Bob_Carpenter posted).
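For concreteness, the construction I mean is the standard one sketched below in R, purely for intuition; the values of K and alpha here are arbitrary illustration values, not the ones I fit with.

set.seed(1)
K <- 4                                        # illustration value only
alpha <- 1                                    # illustration value only
v <- rbeta(K - 1, 1, alpha)                   # stick-breaking proportions, v_k ~ Beta(1, alpha)
eta <- numeric(K)
eta[1] <- v[1]
for (k in 2:(K - 1)) eta[k] <- v[k] * prod(1 - v[1:(k - 1)])
eta[K] <- prod(1 - v)                         # whatever is left of the stick
sum(eta)                                      # 1: eta is a valid simplex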
I’m getting very odd results when generating the mixture assignments. It may be a gap in my understanding, but hopefully someone can clear it up. When I fit the model below, I get the identical mixture assignment for every observation when taking the mode across all posterior samples.
data {
  int<lower=0> K;                       // number of mixture components
  int<lower=0> N;                       // number of observations
  real y[N];                            // observations
}
parameters {
  vector<lower=0, upper=1>[K - 1] v;    // stick-breaking proportions
  real<lower=0, upper=50> lambda[K];    // exponential rate parameters
  real<lower=0, upper=15> alpha;        // DP concentration, DP(alpha, base)
}
transformed parameters {
  // stick-breaking process: eta[k] = v[k] * prod_{j<k} (1 - v[j])
  simplex[K] eta;
  real stick_sum = 0;                   // running total of assigned stick length
  eta[1] = v[1];
  stick_sum = eta[1];
  for (k in 2:(K - 1)) {
    eta[k] = (1 - stick_sum) * v[k];
    stick_sum += eta[k];
  }
  eta[K] = 1 - stick_sum;
}
model {
  real ps[K];
  alpha ~ gamma(3, 0.5);
  v ~ beta(1, alpha);
  for (i in 1:N) {
    for (k in 1:K) {
      ps[k] = log(eta[k]) + exponential_lpdf(y[i] | lambda[k]);
    }
    target += log_sum_exp(ps);          // marginalize over the discrete assignment
  }
}
generated quantities {
  vector[N] log_lik;                    // pointwise log-likelihood for LOO
  int<lower=1, upper=K> Z[N];           // mixture assignment per observation
  for (n in 1:N) {
    vector[K] ll;
    for (k in 1:K) {
      ll[k] = log(eta[k]) + exponential_lpdf(y[n] | lambda[k]);
    }
    log_lik[n] = log_sum_exp(ll);
    Z[n] = categorical_rng(exp(ll - log_lik[n]));   // draw from p(z | y, theta)
  }
}
I run the model in R like this:
library(rstan)

load('SCR_array.RData')
SCR <- SCR_array
n <- length(SCR)

rstan_options(auto_write = TRUE)
options(mc.cores = 2)

dataList <- list(K = 4, N = n, y = SCR)
modelFile <- 'mixture_SCR_model.stan'
nIter <- 6000
nChains <- 4
nWarmup <- floor(nIter / 2)
nThin <- 1

fit_reg <- stan(modelFile,
                data = dataList,
                chains = nChains,
                iter = nIter,
                warmup = nWarmup,
                thin = nThin,
                init = "random",
                seed = 1450154626)
When I extract the mode of the mixture assignments (I adapted this post’s generated quantities block), I find the oddity below. Sometimes the value is 3, sometimes 4, but every time the assignments form a single, delta-like spike on one component.
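(stat.mode isn’t a base R function; I’m using a simple helper along these lines, which returns the most frequent value and breaks ties in favor of the value seen first.)

stat.mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]       # most frequent value in x
}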
> fit_ppd <- extract(fit_reg)
> apply(fit_ppd$Z, 2, stat.mode)
[1] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[47] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[93] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[139] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[185] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[231] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[277] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[323] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
Elsewhere on the forum, someone suggested this method for generating the mixture assignments:
generated quantities {
  vector[N] log_lik;                    // pointwise log-likelihood for LOO
  int<lower=1, upper=K> Z[N];           // mixture assignment per observation
  for (n in 1:N) {
    vector[K] ll;
    for (k in 1:K) {
      ll[k] = log(eta[k]) + exponential_lpdf(y[n] | lambda[k]);
    }
    log_lik[n] = log_sum_exp(ll);
    Z[n] = categorical_logit_rng(ll);   // only line that differs from the version above
  }
}
This gives a nearly identical result to the above. Every other post I’ve found about generating mixture assignments from mixture models uses nearly identical code, so I suspect the two versions are the same under the hood, differing only in where the softmax/logit normalization is applied.
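As a sanity check on that suspicion: exp(ll - log_sum_exp(ll)) is exactly softmax(ll), which is the normalization categorical_logit_rng applies internally, so the two lines should draw from the same distribution. A quick R check with made-up log weights:

ll <- c(-3.2, -1.1, -2.4, -0.9)               # hypothetical per-component log weights
log_lik <- log(sum(exp(ll)))                  # log_sum_exp(ll)
softmax <- exp(ll - max(ll)) / sum(exp(ll - max(ll)))
all.equal(exp(ll - log_lik), softmax)         # TRUE: identical probabilities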
There are some differences between my model block and those in the examples I found, but nothing substantive. What I did notice is that none of them coded the stick-breaking process explicitly in the transformed parameters block. So I checked whether removing that component changes the outcome, and as expected it did. (The first post I mentioned says that Stan’s simplex type uses a stick-breaking transform internally, so this seemed reasonable enough; using a Dirichlet prior here doesn’t change the outcome either, from what I’m finding.) Here’s the code for that model, followed by the modal assignments it produces.
data {
  int<lower=0> K;                       // number of mixture components
  int<lower=0> N;                       // number of observations
  real y[N];                            // observations
}
parameters {
  simplex[K] eta;                       // mixture weights
  real<lower=0, upper=50> lambda[K];    // exponential rate parameters
}
model {
  real ps[K];
  // placing a Dirichlet prior on eta here wouldn't change the results, but one can check
  for (i in 1:N) {
    for (k in 1:K) {
      ps[k] = log(eta[k]) + exponential_lpdf(y[i] | lambda[k]);
    }
    target += log_sum_exp(ps);          // marginalize over the discrete assignment
  }
}
generated quantities {
  vector[N] log_lik;                    // pointwise log-likelihood for LOO
  int<lower=1, upper=K> Z[N];           // mixture assignment per observation
  for (n in 1:N) {
    vector[K] ll;
    for (k in 1:K) {
      ll[k] = log(eta[k]) + exponential_lpdf(y[n] | lambda[k]);
    }
    log_lik[n] = log_sum_exp(ll);
    Z[n] = categorical_rng(exp(ll - log_lik[n]));
  }
}
[1] 2 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 4 1 3 3 3 3 3 3 4 3 3 3 3 3 4 3 3 3 3 3
[47] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 4 3 3 3 3 3 4 3 3 3
[93] 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3
[139] 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 4 4
[185] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 3 3 4
[231] 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3
[277] 3 3 3 3 3 3 1 3 3 3 3 3 3 4 3 4 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3
[323] 3 3 3 3 3 3 3 3 4 3 3 3 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
In fact, the assignments vary between runs; here is a separate attempt using the same Stan code above.
[1] 4 3 3 3 3 4 1 3 1 3 1 2 2 3 1 1 1 3 1 3 4 1 1 3 3 3 1 4 4 3 3 3 1 1 1 4 2 1 4 3 4 1 4 2 1 3
[47] 2 2 3 3 2 1 3 3 3 1 2 1 3 1 1 3 1 4 3 4 4 2 3 3 4 1 4 2 3 3 4 4 4 2 4 4 1 2 1 1 3 4 4 1 4 4
[93] 3 4 1 3 4 1 3 2 3 1 1 1 1 4 1 3 2 1 3 4 3 2 4 3 1 3 4 3 3 1 1 3 3 4 3 1 4 1 4 1 1 4 1 4 3 3
[139] 1 3 2 4 3 1 2 3 2 4 2 3 4 3 1 3 1 1 4 1 4 3 1 4 1 3 1 4 1 1 4 1 4 4 3 1 1 2 3 1 1 3 1 4 1 4
[185] 1 3 1 1 1 1 1 4 1 1 1 2 4 3 1 4 3 1 1 2 2 1 4 4 4 1 4 1 2 4 3 4 4 3 3 4 4 4 1 4 1 1 1 2 4 1
[231] 1 3 2 3 1 4 3 1 4 1 4 3 2 3 3 1 1 4 1 1 4 1 3 1 3 1 3 2 3 4 1 4 4 3 1 3 3 3 4 3 2 1 2 4 3 1
[277] 3 4 3 4 1 4 4 4 1 4 4 2 3 1 3 4 1 3 1 4 1 2 1 1 1 3 4 1 4 4 1 2 1 1 1 1 3 3 4 4 4 3 4 2 1 1
[323] 3 4 3 1 1 1 2 1 4 1 3 4 4 1 1 2 1 4 4 3 3 2 3 4 1 4 1 3 2 2 4 3 3 3 1 2 1
The issue is that none of the generated mixture assignments reflect the mixture proportions I find in the posterior distributions (sorry, I haven’t shown those, but I’ve attached the data for reproduction/replication). And in the runs where the assignments do even out a little, the mixture proportions come out nearly identical across components. Is there a specific way to generate mixture assignments from a hand-coded stick-breaking process that differs from doing so without it? Is there a specific detail I’m missing that would explain the results I’m obtaining?
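(For reference, this is roughly how I’m comparing the assignment frequencies against the posterior weights; it assumes fit_ppd and stat.mode from above, and eta is only in the fit for the stick-breaking model.)

z_mode <- apply(fit_ppd$Z, 2, stat.mode)          # modal assignment per observation
prop.table(table(factor(z_mode, levels = 1:4)))   # empirical assignment frequencies
colMeans(fit_ppd$eta)                             # posterior mean mixture weights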
I’d appreciate any help.
SCR_array.RData (2.8 KB)