I am working on implementing Gaussian mixture model on iris dataset. The code is as shown below.
require(rstan)
library(datasets)
data(iris)
y=array(iris[3:4])
mixture_data=list(N=150, D=2, K=3, y=y)
mixture_model<-'
data {
int D; //number of dimensions
int K; //number of gaussians
int N; //number of data
vector[D] y[N]; //data
}
parameters {
simplex[K] theta; //mixing proportions
vector[D] mu[K]; //mixture component means
cholesky_factor_corr[D] L[K]; //cholesky factor of correlation
vector<lower=0>[K] sigma[D]; // standard deviations
}
transformed parameters{
cholesky_factor_cov[D, D] cov[K];
for (k in 1:K) {
cov[k] = diag_pre_multiply(sigma[k], L[k]);
}
}
model {
real ps[K];
for(k in 1:K){
mu[k] ~ normal(0, 10);
L[k] ~ lkj_corr_cholesky(2);
sigma[k] ~ exponential(1);
}
for (n in 1:N){
for (k in 1:K){
ps[k] = log(theta[k])+multi_normal_cholesky_lpdf(y[n] | mu[k], cov[k]); //increment log probability of the gaussian
}
target += log_sum_exp(ps);
}
}
'
fit=stan(model_code=mixture_model, data=mixture_data, iter=1000, chains=4, init_r=0.5)
The model works perfectly for K=2 and does recover parameters correctly. The sampler throws an error for K = 3.
SAMPLING FOR MODEL 'd45eb236ae863de793b59e995d2e8072' NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: diag_pre_multiply: m1.size() (3) and m2.rows() (2) must match in size (in 'model1d9654deb7fc_d45eb236ae863de793b59e995d2e8072' at line 19)
[1] "Error in sampler$call_sampler(args_list[[i]]) : "
[2] " Exception: diag_pre_multiply: m1.size() (3) and m2.rows() (2) must match in size (in 'model1d9654deb7fc_d45eb236ae863de793b59e995d2e8072' at line 19)"
error occurred during calling the sampler; sampling not done