# Mixture models with heterogeneous mixtures

Hi, I found in the Stan manual the code to implement the following model:

$$y \sim \lambda_1 N(\mu_1, \sigma) + \dots + \lambda_K N(\mu_K, \sigma)$$

I would instead like to implement a mixture of K distributions where not every observation comes from all K components. Say the distributions represent K features of y, but no single y is observed with all of them, and I still want to pool the evidence across observations.

Let’s say that I have a matrix P such that P_{n,k} = 1 if y_n has feature k and P_{n,k} = 0 otherwise. I would like to estimate the following model:

$$y_n \sim \frac{P_{n,1}\,\lambda_1}{\sum \limits_{k=1}^K P_{n,k}\,\lambda_k} N(\mu_1, \sigma) + \dots + \frac{P_{n,K}\,\lambda_K}{\sum \limits_{k=1}^K P_{n,k}\,\lambda_k} N(\mu_K, \sigma)$$

So a component gets weight 0 if its feature is not present for y_n, and the remaining weights are renormalized to sum to 1. I tried the following code, but it does not work; I suspect it is just my poor coding.
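Written out in NumPy, the likelihood I am after looks something like this (a sketch of the target density only, not Stan code; the name `mixture_loglik` is mine):

```python
import numpy as np

def mixture_loglik(y, P, lam, mu, sigma):
    """Log-likelihood of the masked, renormalized mixture described above.

    y     : (N,) observations
    P     : (N, K) 0/1 matrix, P[n, k] == 1 if y[n] has feature k
    lam   : (K,) mixture weights over all K components
    mu    : (K,) component means; sigma is a shared scale
    """
    total = 0.0
    for n in range(len(y)):
        w = P[n] * lam
        w = w / w.sum()          # absent features get weight 0, the rest renormalize to 1
        dens = w * np.exp(-0.5 * ((y[n] - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        total += np.log(dens.sum())
    return total
```

With only feature 1 present the density collapses to a single normal; with both present it is the usual two-component mixture.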

```stan
model {
  for (n in 1:N) {
    real C = col(P, n)' * lambda;        // normalizing constant
    vector[K] lps;
    for (k in 1:K) {
      lps[k] = log((lambda[k] * P[k, n]) / C);
      lps[k] += normal_lpdf(y[n] | mu[k], sigma);
    }
    target += log_sum_exp(lps);
  }
}
```


Is there a way to implement this in Stan? Or do you suggest any other modelling alternatives?

Thanks in advance for the help.

If I understand correctly, you are trying to condition the distribution that increments the target on the presence or absence of some feature of the data. Have you considered nesting a conditional statement in your for loop?

```stan
model {
  for (n in 1:N) {
    real C = col(P, n)' * lambda;
    vector[K] lps;
    for (k in 1:K) {
      if (condition) {
        // ...
      } else {
        // ...
      }
    }
    target += log_sum_exp(lps);
  }
}
```
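One way to picture the conditional (a NumPy sketch under the same assumptions; `obs_loglik` and `log_sum_exp` are hypothetical names, not the Stan model): branch on presence so that absent components never enter the log-sum-exp at all:

```python
import numpy as np

def log_sum_exp(a):
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

def obs_loglik(y_n, p_n, lam, mu, sigma):
    """Mixture log-density for one observation: keep only the
    components whose feature is present (p_n[k] == 1), with the
    weights renormalized over those components."""
    present = np.flatnonzero(p_n)
    w = lam[present] / lam[present].sum()    # C = sum of present lambdas
    lps = (np.log(w)
           - 0.5 * np.log(2.0 * np.pi * sigma ** 2)
           - 0.5 * ((y_n - mu[present]) / sigma) ** 2)
    return log_sum_exp(lps)
```

Skipping the absent components sidesteps the question of what log weight to assign them, since they are simply not in the vector being summed.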


Thanks! Following your suggestion, I tried this:

```stan
transformed parameters {
  matrix[K, N] lambda_t;
  for (n in 1:N) {
    real C = col(P, n)' * lambda;        // normalizing constant
    for (k in 1:K) {
      if (P[k, n] == 0)
        lambda_t[k, n] = 1;
      else
        lambda_t[k, n] = (lambda[k] * P[k, n]) / C;
    }
  }
}

model {
  y ~ normal(theta, sigma_y);

  for (n in 1:N) {
    vector[K] lps = log(col(lambda_t, n));
    for (k in 1:K) {
      if (P[k, n] == 0)
        lps[k] += 0;
      else
        lps[k] += normal_lpdf(y[n] | mu[k], sigma);
    }
    target += log_sum_exp(lps);
  }
}
```


But it doesn’t really work, unfortunately: since exp(0) = 1, the absent components still contribute, whereas I would like them to contribute exp(-infinity) = 0, which I could not manage in Stan from what I tried.
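To be explicit about the behaviour I want: in ordinary floating point, a log weight of minus infinity does make a component vanish from the sum, as this Python check shows (`log_sum_exp` here is my own helper, not Stan's built-in):

```python
import math

# exp(-inf) is exactly 0.0 in IEEE floating point
assert math.exp(-math.inf) == 0.0

def log_sum_exp(xs):
    m = max(xs)
    if m == -math.inf:           # every component masked out
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

# a -inf entry drops out of the sum: log(0 + exp(-1.2)) == -1.2
assert abs(log_sum_exp([-math.inf, -1.2]) - (-1.2)) < 1e-12
```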

I also tried this:

```stan
model {
  y ~ normal(theta, sigma_y);

  for (n in 1:N) {
    vector[K] lps = log(col(lambda_t, n));
    for (k in 1:K) {
      if (P[k, n] == 0)
        lps[k] += negative_infinity();
      else
        lps[k] += normal_lpdf(y[n] | mu[k], sigma);
    }
    target += log_sum_exp(lps);
  }
}
```


This uses negative_infinity(), but unfortunately it does not work either.