How to fit three component mixture model

Hello! I want to fit three component mixture model. Let W ~ 𝑓(𝑋𝑖,π‘Œπ‘–) be a set of two-dimensional random vectors and we want to fit

        (𝑋𝑖,π‘Œπ‘–) ~ π‘Š0𝑖𝑓(.|πœ†0π‘–πœ‡0𝑖)+π‘Š1𝑖𝑓(.|πœ†1π‘–πœ‡1𝑖)+ π‘Š2𝑖𝑓(.|πœ†2π‘–πœ‡2𝑖)

Where 𝑓(.|πœ†πœ‡) is the product of two Poisson distribution and the mean of the three components were defined as follows:

        πœ†0𝑖 > πœ‡0𝑖 ,  πœ†1𝑖 = πœ‡1𝑖  and  πœ†2𝑖 < πœ‡2𝑖  respectively

and I write the following R code:

data {
      int<lower=1> N;              
      int<lower=2> y1[N];
      int<lower=2> y2[N]; 
      matrix[N,4] x1;
      matrix[N,4] x2;
      real<lower=0> ci;
      vector[N] mci1;
      vector[N] mci2;
      vector[N] mci;
      real<lower=0> mc;
      vector[N] a1;
      vector[N] a2;
      vector[N] a3;
      real<lower=0> ma1;
      real<lower=0> ma2;
      real<lower=0> ma3;
  }
transformed data{
      int y_ast1[N];  
      int y_ast2[N];
       for (i in 1:N){
          y_ast1[i] = y1[i] - 2;
          y_ast2[i] = y2[i] - 2; 
       }
  }
parameters {
      vector[4] beta1;
      vector[4] beta2;
      real alpha1;
      real alpha2; 
      vector<lower=0, upper=1>[N] p1; 
      vector<lower=0, upper=1>[N] p2;
 }
   transformed parameters{
     vector[N] mu1 = exp(alpha1 + x1*beta1);
     vector[N] mu2 = exp(alpha2 + x2*beta2);
  }
  model {
      beta1 ~ normal(0,3);    
      alpha1 ~ normal(0,3);
      beta2 ~ normal(0,3);    
      alpha2 ~ normal(0,3);
      p1 ~ beta(a1,ma1);
      p2 ~ beta(a2,ma2);
      p3 ~ beta(a3,ma3);
      
  // Likelihood
   for(i in 1:N) 
    target += log_sum_exp(
                 log(p1) + poisson_lpmf(y_ast1[i]|mu1[i]*exp(ci))*poisson_lpmf(y_ast2[i]|mu1[i]), // mu1 > mu2, then I substitute mu2 by mu1 and I multiply mu1 by positive number 
                 log(p2) + poisson_lpmf(y_ast1[i]|mu1[i])*poisson_lpmf(y_ast1[i]|mu1[i]), // mu1=mu2, then I put mu1 in both
                 log(1-p1-p2) + poisson_lpmf(y_ast1[i]|mu1[i]*(1/exp(ci)))*poisson_lpmf(y_ast2[i]|mu1[i]));   // mu1 < mu2 , then I did the reverse what I put in the first           
   }
 }

When I run the above code, I got the following err

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:   log_sum_exp(vector, vector, vector)

Therefore, would you please guide me how I fix it

Thanks for you help!

log_sum_exp just takes a vector or an array so you need to do something like this:

real lp[3];
lp[1]=log(pi1) +...
lp[2]= ...
lp[3]= ...
target += log_sum_exp(lp)

the variable has to be declared at the top of the scope. e.g. top of the loop or top of the model block.

2 Likes

I defined the mixing parameter (p) as simplex[N] p[3]

parameters {
         simplex[N] p[3];
       }

 // Likelihood
      
   for(i in 1:N){
         lp[1] = log(p[1,i]) + poisson_lpmf(y_ast1[i]|mu1[i]*exp(ci))
                            + poisson_lpmf(y_ast2[i]|mu1[i]); 
         lp[2] = log(p[2,i]) + poisson_lpmf(y_ast1[i]|mu1[i])
                            + poisson_lpmf(y_ast2[i]|mu1[i]);
         lp[3] = log(p[3,i]) + poisson_lpmf(y_ast1[i]|mu1[i])
                            + poisson_lpmf(y_ast2[i]|mu1[i]*exp(ci));
       }
         target += log_sum_exp(lp);
       }

I got the following probability, but p1 + p2 + p3 is not 1. What is the problem on it?

p1 p2 p3
0.001 0.000 0.007
0.001 0.000 0.007
0.001 0.001 0.005

That defines three simplexes, each of which has N components. You want simplexes that have three components each so change the order.

parameters {
  simplex[3] p[N];
}
model {
  // Likelihood
  for(i in 1:N){
    lp[1] = log(p[i,1]) + poisson_lpmf(y_ast1[i]|mu1[i]*exp(ci))
                        + poisson_lpmf(y_ast2[i]|mu1[i]); 
    lp[2] = log(p[i,2]) + poisson_lpmf(y_ast1[i]|mu1[i])
                        + poisson_lpmf(y_ast2[i]|mu1[i]);
    lp[3] = log(p[i,3]) + poisson_lpmf(y_ast1[i]|mu1[i])
                        + poisson_lpmf(y_ast2[i]|mu1[i]*exp(ci));
    target += log_sum_exp(lp);
  }
}
2 Likes

Would you please give some solutions for this error?

Thank you in advance!

  data {
      int<lower=1> N;              
      int<lower=2> y1[N];
      int<lower=2> y2[N]; 
      matrix[N,2] x1;
      matrix[N,2] x2;
      vector[N] mci1;
      vector[N] mci2;
      vector[N] mci;
      real<lower=0> mc;
      
      vector[N] a1;  
      vector[N] a2;  
      vector[N] a3; 
  }
  parameters {
      vector[2] beta1;
      vector[2] beta2;
      real alpha1;
      real alpha2;
      real<lower=0> z1;
      real<lower=0> z2;
      simplex[3] p[N];    
 }

  transformed parameters{
     vector[N] mu1 = exp(alpha1 + x1*beta1);
     vector[N] mu2 = exp(alpha2 + x1*beta2);     
  }

  model {
      real lp[3];
      beta1 ~ normal(0,3);    
      alpha1 ~ normal(0,3);
      beta2 ~ normal(0,3);    
      alpha2 ~ normal(0,3);
      z1 ~ normal(0,3);
      z2 ~ normal(0,3);
      
      for (i in 1:N){
        p[1,i] ~ dirichlet(a1);
        p[2,i] ~ dirichlet(a2);
        p[3,i] ~ dirichlet(a3);   
     }
       
   for(i in 1:N){
     lp[1] = log(p[i,1]) + poisson_lpmf(y1[i]|mu1[i])
                        + poisson_lpmf(y2[i]|mu1[i]*exp(1/z1)); 
     lp[2] = log(p[i,2]) + poisson_lpmf(y1[i]|mu1[i])
                        + poisson_lpmf(y2[i]|mu1[i]);
     lp[3] = log(p[i,3]) + poisson_lpmf(y1[i]|mu2[i]*exp(1/z2))
                        + poisson_lpmf(y2[i]|mu2[i]);
   }
     target += log_sum_exp(lp);
   }

I got the following error:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for: 
  real ~ dirichlet(vector)
Available argument signatures for dirichlet:
  vector ~ dirichlet(vector)
Real return type required for probability function.
 error in 'model2f48673a10e_642004c3cf6c253f09f4181bc04ee72f' at line 67, column 31
  -------------------------------------------------
    65:       for (i in 1:N){
    67:         p[1,i] ~ dirichlet(a1);
                                      ^
    68:         p[2,i] ~ dirichlet(a2);
  -------------------------------------------------
Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  failed to parse Stan model '642004c3cf6c253f09f4181bc04ee72f' due to the above error.

Your dirichlet prior still thinks there are three simplexes of size N.
Change it to

data {
  ...
  vector[3] a[N];
}
model {
  ...
  for (i in 1:N) {
    p[i] ~ dirichlet(a[i]);
  }
  ...
}

Thank you for your replay. I changed this part. Still this part is unchanged:

for(i in 1:N){
     lp[1] = log(p[i,1]) + poisson_lpmf(y1[i]|mu1[i])
                        + poisson_lpmf(y2[i]|mu1[i]*exp(1/z1)); 
     lp[2] = log(p[i,2]) + poisson_lpmf(y1[i]|mu1[i])
                        + poisson_lpmf(y2[i]|mu1[i]);
     lp[3] = log(p[i,3]) + poisson_lpmf(y1[i]|mu2[i]*exp(1/z2))
                        + poisson_lpmf(y2[i]|mu2[i]);
   }

I had three Dirichlet parameters a1,a2,a3 and the data was put separately, but now I have tried to consider data "a" that contains a1,a2,a3 at a time. I tried to define like this:

a = c(a1,a2,a3) and got this error.

Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) : 
  Exception: variable does not exist; processing stage=data initialization; variable name=a; base type=vector_d  (in 'model17003825123c_18645674a8afd019cf75676db7926bc2' at line 15)

failed to create the sampler; sampling not done

It says β€œvariable does not exist” so it coulnd’t find the a for some reason. Anyway, you can also pass the data as before and combine the three parameters in Stan.

data {
      ...
      vector[N] a1;  
      vector[N] a2;  
      vector[N] a3; 
}
transformed data {
      vector[3] a[N];
      for (i in 1:N) {
            a[i,1] = a1[i];
            a[i,2] = a2[i];
            a[i,3] = a3[i];
      }
}
...
1 Like

still error related to real and vector

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for: 
  real ~ dirichlet(real)
Available argument signatures for dirichlet:
  vector ~ dirichlet(vector)
Real return type required for probability function.
 error in 'model1700a137068_334d93a5faae7ddf14907afd2d214515' at line 71, column 36
  -------------------------------------------------
    69:       for (i in 1:N){
    70:       
    71:          p[i,1] ~ dirichlet(a[i,1]);
                                           ^
    72:          p[i,2] ~ dirichlet(a[i,2]);
  -------------------------------------------------
Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  failed to parse Stan model '334d93a5faae7ddf14907afd2d214515' due to the above error.
model {
  ...
  for (i in 1:N) {
    p[i] ~ dirichlet(a[i]);
  }
  ...
}
2 Likes