In the "Stan Modeling Language: User's Guide and Reference Manual", the Gaussian mixture code is on page 116:
data {
  int<lower=1> K;          // number of mixture components
  int<lower=1> N;          // number of data points
  real y[N];               // observations
}
parameters {
  simplex[K] theta;        // mixing proportions
  real mu[K];              // locations of mixture components
  real<lower=0> sigma[K];  // scales of mixture components
}
model {
  real ps[K];              // temp for log component densities
  sigma ~ cauchy(0, 2.5);
  mu ~ normal(0, 10);
  for (n in 1:N) {
    for (k in 1:K) {
      ps[k] = log(theta[k])
              + normal_lpdf(y[n] | mu[k], sigma[k]);
    }
    target += log_sum_exp(ps);
  }
}
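For reference, each term this model adds to the target is the marginal log density of one observation, with the discrete component indicator summed out:

$$\log p(y_n \mid \theta, \mu, \sigma) = \log \sum_{k=1}^{K} \theta_k \, \mathcal{N}(y_n \mid \mu_k, \sigma_k),$$

which is what log_sum_exp(ps) evaluates in a numerically stable way.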
I slightly adjusted this code for my own purposes, in particular adding a Dirichlet prior on w:
data {
  int<lower=1> C;          // number of mixture components
  int<lower=1> V;          // number of data points
  real s[V];               // observations, from textbook
}
parameters {
  simplex[C] w;            // mixing proportions
  real mu[C];              // locations of mixture components, from textbook
  real<lower=0> sigma[C];  // scales of mixture components, from textbook
}
model {
  vector[C] ps;                     // temp for log component densities, added by Ziyi
  vector[V] lp_obs;                 // added by Ziyi
  w ~ dirichlet(rep_vector(2, C));  // added by Ziyi
  mu ~ normal(0, 10);
  sigma ~ cauchy(0, 2.5);
  for (n in 1:V) {
    for (k in 1:C) {
      ps[k] = log(w[k]) + normal_lpdf(s[n] | mu[k], sigma[k]);
    }
    lp_obs[n] = log_sum_exp(ps);
  }
  target += sum(lp_obs);            // added by Ziyi
}
Even with
iter=2000, warmup=1000, chains=1, cores=1, control=list(adapt_delta=0.9)
it takes 7198.78 s for the 2000 iterations. I remember that when I used a random-walk proposal in plain R (no C++), each iteration took under 1 s. The code above is only slightly adjusted from the manual: I collect the per-observation log-likelihoods in lp_obs and add their sum to target in one statement, which should be faster than the manual's version (see "Vectorizing Summations", manual page 239).
How can we make it faster?
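One micro-optimization I sketched but have not benchmarked: log(w) does not depend on n, so it can be hoisted out of the double loop and computed once per log-density evaluation instead of V*C times. Only the model block changes; data and parameters stay as above:

model {
  vector[C] log_w;                  // log mixing proportions, computed once (sketch, unbenchmarked)
  vector[C] ps;
  vector[V] lp_obs;
  log_w = log(w);                   // hoisted out of the loops: 1 call instead of V*C
  w ~ dirichlet(rep_vector(2, C));
  mu ~ normal(0, 10);
  sigma ~ cauchy(0, 2.5);
  for (n in 1:V) {
    for (k in 1:C) {
      ps[k] = log_w[k] + normal_lpdf(s[n] | mu[k], sigma[k]);
    }
    lp_obs[n] = log_sum_exp(ps);
  }
  target += sum(lp_obs);
}

I suspect most of the time is spent in normal_lpdf and its gradient, though, so the gain may be modest.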
If I use Stan from Julia or Python, will it be faster than using Stan from R?
BTW, when I change "w ~ dirichlet(rep_vector(2, C));" to "w ~ dirichlet(rep_vector(1/C, C));" to encourage sparsity in w, initialization fails.
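My guess (unverified) is that because C is declared int, 1/C is integer division and evaluates to 0, so every Dirichlet concentration parameter is 0, which is invalid. Writing the constant as a real might be all that is needed:

w ~ dirichlet(rep_vector(1.0 / C, C));  // 1.0/C forces real division; 1/C with int C evaluates to 0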
Thanks.
Ziyi