# How to fit three component mixture model

Hello! I want to fit three component mixture model. Let `W ~ 𝑓(𝑋𝑖,𝑌𝑖)` be a set of two-dimensional random vectors and we want to fit

``````        (𝑋𝑖,𝑌𝑖) ~ 𝑊0𝑖𝑓(.|𝜆0𝑖𝜇0𝑖)+𝑊1𝑖𝑓(.|𝜆1𝑖𝜇1𝑖)+ 𝑊2𝑖𝑓(.|𝜆2𝑖𝜇2𝑖)
``````

Where `𝑓(.|𝜆𝜇)` is the product of two Poisson distribution and the mean of the three components were defined as follows:

``````        𝜆0𝑖 > 𝜇0𝑖 ,  𝜆1𝑖 = 𝜇1𝑖  and  𝜆2𝑖 < 𝜇2𝑖  respectively
``````

and I write the following R code:

``````data {
int<lower=1> N;
int<lower=2> y1[N];
int<lower=2> y2[N];
matrix[N,4] x1;
matrix[N,4] x2;
real<lower=0> ci;
vector[N] mci1;
vector[N] mci2;
vector[N] mci;
real<lower=0> mc;
vector[N] a1;
vector[N] a2;
vector[N] a3;
real<lower=0> ma1;
real<lower=0> ma2;
real<lower=0> ma3;
}
transformed data{
int y_ast1[N];
int y_ast2[N];
for (i in 1:N){
y_ast1[i] = y1[i] - 2;
y_ast2[i] = y2[i] - 2;
}
}
parameters {
vector beta1;
vector beta2;
real alpha1;
real alpha2;
vector<lower=0, upper=1>[N] p1;
vector<lower=0, upper=1>[N] p2;
}
transformed parameters{
vector[N] mu1 = exp(alpha1 + x1*beta1);
vector[N] mu2 = exp(alpha2 + x2*beta2);
}
model {
beta1 ~ normal(0,3);
alpha1 ~ normal(0,3);
beta2 ~ normal(0,3);
alpha2 ~ normal(0,3);
p1 ~ beta(a1,ma1);
p2 ~ beta(a2,ma2);
p3 ~ beta(a3,ma3);

// Likelihood
for(i in 1:N)
target += log_sum_exp(
log(p1) + poisson_lpmf(y_ast1[i]|mu1[i]*exp(ci))*poisson_lpmf(y_ast2[i]|mu1[i]), // mu1 > mu2, then I substitute mu2 by mu1 and I multiply mu1 by positive number
log(p2) + poisson_lpmf(y_ast1[i]|mu1[i])*poisson_lpmf(y_ast1[i]|mu1[i]), // mu1=mu2, then I put mu1 in both
log(1-p1-p2) + poisson_lpmf(y_ast1[i]|mu1[i]*(1/exp(ci)))*poisson_lpmf(y_ast2[i]|mu1[i]));   // mu1 < mu2 , then I did the reverse what I put in the first
}
}
``````

When I run the above code, I got the following err

``````SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:   log_sum_exp(vector, vector, vector)
``````

Therefore, would you please guide me how I fix it

Thanks for you help!

log_sum_exp just takes a vector or an array so you need to do something like this:

``````real lp;
lp=log(pi1) +...
lp= ...
lp= ...
target += log_sum_exp(lp)

``````

the variable has to be declared at the top of the scope. e.g. top of the loop or top of the model block.

1 Like

I defined the mixing parameter (`p`) as `simplex[N] p`

``````parameters {
simplex[N] p;
}

// Likelihood

for(i in 1:N){
lp = log(p[1,i]) + poisson_lpmf(y_ast1[i]|mu1[i]*exp(ci))
+ poisson_lpmf(y_ast2[i]|mu1[i]);
lp = log(p[2,i]) + poisson_lpmf(y_ast1[i]|mu1[i])
+ poisson_lpmf(y_ast2[i]|mu1[i]);
lp = log(p[3,i]) + poisson_lpmf(y_ast1[i]|mu1[i])
+ poisson_lpmf(y_ast2[i]|mu1[i]*exp(ci));
}
target += log_sum_exp(lp);
}
``````

I got the following probability, but ` p1 + p2 + p3` is not 1. What is the problem on it?

p1 p2 p3
0.001 0.000 0.007
0.001 0.000 0.007
0.001 0.001 0.005

That defines three simplexes, each of which has `N` components. You want simplexes that have three components each so change the order.

``````parameters {
simplex p[N];
}
model {
// Likelihood
for(i in 1:N){
lp = log(p[i,1]) + poisson_lpmf(y_ast1[i]|mu1[i]*exp(ci))
+ poisson_lpmf(y_ast2[i]|mu1[i]);
lp = log(p[i,2]) + poisson_lpmf(y_ast1[i]|mu1[i])
+ poisson_lpmf(y_ast2[i]|mu1[i]);
lp = log(p[i,3]) + poisson_lpmf(y_ast1[i]|mu1[i])
+ poisson_lpmf(y_ast2[i]|mu1[i]*exp(ci));
target += log_sum_exp(lp);
}
}
``````
2 Likes

Would you please give some solutions for this error?

``````  data {
int<lower=1> N;
int<lower=2> y1[N];
int<lower=2> y2[N];
matrix[N,2] x1;
matrix[N,2] x2;
vector[N] mci1;
vector[N] mci2;
vector[N] mci;
real<lower=0> mc;

vector[N] a1;
vector[N] a2;
vector[N] a3;
}
parameters {
vector beta1;
vector beta2;
real alpha1;
real alpha2;
real<lower=0> z1;
real<lower=0> z2;
simplex p[N];
}

transformed parameters{
vector[N] mu1 = exp(alpha1 + x1*beta1);
vector[N] mu2 = exp(alpha2 + x1*beta2);
}

model {
real lp;
beta1 ~ normal(0,3);
alpha1 ~ normal(0,3);
beta2 ~ normal(0,3);
alpha2 ~ normal(0,3);
z1 ~ normal(0,3);
z2 ~ normal(0,3);

for (i in 1:N){
p[1,i] ~ dirichlet(a1);
p[2,i] ~ dirichlet(a2);
p[3,i] ~ dirichlet(a3);
}

for(i in 1:N){
lp = log(p[i,1]) + poisson_lpmf(y1[i]|mu1[i])
+ poisson_lpmf(y2[i]|mu1[i]*exp(1/z1));
lp = log(p[i,2]) + poisson_lpmf(y1[i]|mu1[i])
+ poisson_lpmf(y2[i]|mu1[i]);
lp = log(p[i,3]) + poisson_lpmf(y1[i]|mu2[i]*exp(1/z2))
+ poisson_lpmf(y2[i]|mu2[i]);
}
target += log_sum_exp(lp);
}
``````

I got the following error:

``````SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:
real ~ dirichlet(vector)
Available argument signatures for dirichlet:
vector ~ dirichlet(vector)
Real return type required for probability function.
error in 'model2f48673a10e_642004c3cf6c253f09f4181bc04ee72f' at line 67, column 31
-------------------------------------------------
65:       for (i in 1:N){
67:         p[1,i] ~ dirichlet(a1);
^
68:         p[2,i] ~ dirichlet(a2);
-------------------------------------------------
Error in stanc(file = file, model_code = model_code, model_name = model_name,  :
failed to parse Stan model '642004c3cf6c253f09f4181bc04ee72f' due to the above error.``````

Your dirichlet prior still thinks there are three simplexes of size N.
Change it to

``````data {
...
vector a[N];
}
model {
...
for (i in 1:N) {
p[i] ~ dirichlet(a[i]);
}
...
}
``````

Thank you for your replay. I changed this part. Still this part is unchanged:

``````for(i in 1:N){
lp = log(p[i,1]) + poisson_lpmf(y1[i]|mu1[i])
+ poisson_lpmf(y2[i]|mu1[i]*exp(1/z1));
lp = log(p[i,2]) + poisson_lpmf(y1[i]|mu1[i])
+ poisson_lpmf(y2[i]|mu1[i]);
lp = log(p[i,3]) + poisson_lpmf(y1[i]|mu2[i]*exp(1/z2))
+ poisson_lpmf(y2[i]|mu2[i]);
}
``````

I had three Dirichlet parameters `a1,a2,a3` and the data was put separately, but now I have tried to consider data `"a"` that contains `a1,a2,a3` at a time. I tried to define like this:

`a = c(a1,a2,a3)` and got this error.

``````Error in new_CppObject_xp(fields\$.module, fields\$.pointer, ...) :
Exception: variable does not exist; processing stage=data initialization; variable name=a; base type=vector_d  (in 'model17003825123c_18645674a8afd019cf75676db7926bc2' at line 15)

failed to create the sampler; sampling not done``````

It says “variable does not exist” so it coulnd’t find the `a` for some reason. Anyway, you can also pass the data as before and combine the three parameters in Stan.

``````data {
...
vector[N] a1;
vector[N] a2;
vector[N] a3;
}
transformed data {
vector a[N];
for (i in 1:N) {
a[i,1] = a1[i];
a[i,2] = a2[i];
a[i,3] = a3[i];
}
}
...
``````
1 Like

still error related to real and vector

``````SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:
real ~ dirichlet(real)
Available argument signatures for dirichlet:
vector ~ dirichlet(vector)
Real return type required for probability function.
error in 'model1700a137068_334d93a5faae7ddf14907afd2d214515' at line 71, column 36
-------------------------------------------------
69:       for (i in 1:N){
70:
71:          p[i,1] ~ dirichlet(a[i,1]);
^
72:          p[i,2] ~ dirichlet(a[i,2]);
-------------------------------------------------
Error in stanc(file = file, model_code = model_code, model_name = model_name,  :
failed to parse Stan model '334d93a5faae7ddf14907afd2d214515' due to the above error.``````
``````model {
...
for (i in 1:N) {
p[i] ~ dirichlet(a[i]);
}
...
}
``````
2 Likes