Hello,
I am making progress with my mixture model: it now fits the data pretty nicely:
(the histogram shows the data, while the lines show samples from the posterior)
I still have some warnings from the sampler though:
Warning: 83 of 8000 (1.0%) transitions ended with a divergence.
See https://mc-stan.org/misc/warnings for details.
Warning: 2008 of 8000 (25.0%) transitions hit the maximum treedepth limit of 10.
See https://mc-stan.org/misc/warnings for details.
Warning: 4 of 4 chains had an E-BFMI less than 0.3.
See https://mc-stan.org/misc/warnings for details.
My current guess is that this is due to the dependency between two of my parameters:
- theta - the mixture component weights (i.e., I am running a mixture model)
- kappa - the von Mises concentration.
As can be seen in the plot below, if kappa is very small, then theta is relatively unconstrained, and vice versa. That makes sense to me: as kappa → 0 a von Mises component flattens towards the uniform density 1/(2π), so the data can no longer distinguish it from the uniform component, and its weight is only weakly identified.
Are there any “standard tricks” I should know about for dealing with such situations?
For now, I am experimenting with different priors to see if that will help.
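One concrete experiment (a minimal sketch only; the lower bound of 2 is an illustrative guess, not a recommendation) is to sample kappa directly with a hard lower bound, so no von Mises component can drift into the near-uniform regime where its weight is unidentified. Here kappa would replace the log_kappa parameter and the exp() transform:

parameters {
  // hypothetical bound: keeps every component away from kappa ~ 0,
  // where a von Mises is nearly flat and theta is unconstrained
  array[K] vector<lower = 2>[8] kappa;
}
model {
  for (kk in 1:K) {
    // lognormal(4, 0.5) on kappa corresponds to the normal(4, 0.5)
    // prior I currently put on log_kappa (up to truncation at the bound)
    kappa[kk] ~ lognormal(4, 0.5);
  }
}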
As always, any hints, tips and advice are very welcome.
Alasdair
Model below:
data {
  int<lower = 0> N;                        // number of observations
  int<lower = 1> K;                        // number of experimental conditions
  array[N] int<lower = 1, upper = K> X;    // which condition each trial is in
  array[N] real phi;                       // observed angles (radians)
  int<lower = 0> L;                        // number of participants
  array[N] int<lower = 1, upper = L> Z;    // which participant each trial belongs to
  real prior_theta;                        // Dirichlet concentration for theta
  real prior_sigma_u;                      // rate of the exponential prior on sigma_u
}
transformed data {
  // fixed component means: 8 equally spaced directions around the circle
  array[8] real mu;
  for (i in 1:8) {
    mu[i] = (i - 1) * pi() / 4;
  }
}
parameters {
  // mixing proportions: 8 von Mises components + 1 uniform component
  // fixed effects
  array[K] simplex[9] theta;
  array[K] vector[8] log_kappa;
  // random effects (per condition, per participant)
  array[K, L] vector[9] u;
  array[K, 9] real<lower = 0> sigma_u;
}
transformed parameters {
  // per-participant mixing weights (on the log scale):
  // perturb the condition-level weights by u, then renormalise
  array[K, L] vector[9] log_theta_u;
  // kappa is sampled on the log scale; transform back
  array[K] vector[8] kappa = exp(log_kappa);
  for (k in 1:K) {
    for (l in 1:L) {
      log_theta_u[k, l] = log_softmax(log(theta[k]) + u[k, l]);
    }
  }
}
model {
  //////////////////////////////////////////////
  // setting priors ////////////////////////////
  //////////////////////////////////////////////
  for (kk in 1:K) {
    theta[kk] ~ dirichlet(rep_vector(prior_theta, 9));
    log_kappa[kk] ~ normal(4, 0.5);
    sigma_u[kk] ~ exponential(prior_sigma_u);
    for (ll in 1:L) {
      u[kk, ll] ~ normal(0, to_vector(sigma_u[kk]));
    }
  }
  //////////////////////////////////////////////
  // fitting model /////////////////////////////
  //////////////////////////////////////////////
  for (n in 1:N) {
    int kk = X[n];  // which condition are we in
    int ll = Z[n];  // which person is it
    vector[9] lps;  // log contribution of each mixture component
    for (psi in 1:8) {
      lps[psi] = log_theta_u[kk, ll][psi]
                 + von_mises_lpdf(phi[n] | mu[psi], kappa[kk, psi]);
    }
    // component 9: the uniform component
    lps[9] = log_theta_u[kk, ll][9] + uniform_lpdf(phi[n] | -pi(), pi());
    target += log_sum_exp(lps);
  }
}
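Another "standard trick" I am planning to try is a non-centered parameterisation of the random effects, on the assumption that the funnel between sigma_u and u is behind (some of) the divergences and the low E-BFMI. A minimal sketch of just the blocks that would change:

parameters {
  // standard-normal raw effects replace u
  array[K, L] vector[9] u_raw;
  array[K, 9] real<lower = 0> sigma_u;
}
transformed parameters {
  array[K, L] vector[9] u;
  for (k in 1:K) {
    for (l in 1:L) {
      // non-centered: u = sigma_u .* u_raw, so the sampler works on a
      // fixed standard-normal geometry instead of the sigma_u funnel
      u[k, l] = to_vector(sigma_u[k]) .* u_raw[k, l];
    }
  }
}
model {
  for (k in 1:K) {
    for (l in 1:L) {
      u_raw[k, l] ~ std_normal();
    }
  }
  // the priors on theta, log_kappa and sigma_u, and the mixture
  // likelihood, stay exactly as in the model above
}

If the treedepth warnings persist after that, I will also try raising the maximum treedepth in the sampler settings.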