Mixture models: how to get P(data | cluster)

Hello,

The answer could be as simple as saving

neg_binomial_2_lpmf(… | …, …)

in the transformed parameters block and using it in the model, so you would have a measure of the likelihood of one data point belonging to a cluster. Then I would apply softmax.
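
Roughly, a sketch of what I mean (hypothetical names for now, assuming G data points y and two negative binomial components):

transformed parameters {
  // per-datum, per-component log likelihoods (hypothetical sketch)
  matrix[G, 2] lp;
  for (g in 1:G) {
    lp[g, 1] = neg_binomial_2_lpmf(y[g] | mu1, si[1]);
    lp[g, 2] = neg_binomial_2_lpmf(y[g] | mu2, si[2]);
  }
}

and then something like softmax(lp[g]') per datum.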

Is this correct?

What about the generated quantities approach in the documentation? I didn't really get it.

data {
  int<lower=0> N_tilde;
  vector[N_tilde] y_tilde;
  ...
}
generated quantities {
  // log marginal density of each held-out point under the mixture
  vector[N_tilde] log_p_y_tilde;
  for (n in 1:N_tilde)
    log_p_y_tilde[n]
      = log_mix(lambda,
                normal_lpdf(y_tilde[n] | mu[1], sigma[1]),
                normal_lpdf(y_tilde[n] | mu[2], sigma[2]));
}

Thanks

I sort of get what you're asking, but could you be more specific about what you want and what your model is? Otherwise people will waste their time trying to answer it in its current form.

Sure.

My mixture model is:

data {
  int<lower=1> G;     // number of genes
  int<lower=0> y[G];  // data: gene expression (integer counts)
  real num_zeros;     // number of zero counts
}

parameters {
  real<lower=0> mu1;    // location of component 1
  real<lower=mu1> mu2;  // location of component 2, ordered above mu1
  real mu_zero[2];      // logit-scale zero-inflation parameters

  real<lower=0> si[2];  // negative binomial dispersions
  real<lower=0, upper=1> lambda;       // mixture weight for the counts
  real<lower=0, upper=1> lambda_zero;  // mixture weight for the zero indicators
}
model {
  // aggregated zero-inflation indicator terms
  target += log_mix(lambda_zero,
                    bernoulli_logit_lpmf(0 | mu_zero[1]),
                    bernoulli_logit_lpmf(0 | mu_zero[2])) * num_zeros;
  target += log_mix(lambda_zero,
                    bernoulli_logit_lpmf(1 | mu_zero[1]),
                    bernoulli_logit_lpmf(1 | mu_zero[2])) * G;

  // two-component negative binomial mixture over the counts
  for (g in 1:G) {
    target += log_mix(lambda,
                      neg_binomial_2_lpmf(y[g] | mu1, si[1]),   // NON
                      neg_binomial_2_lpmf(y[g] | mu2, si[2]));  // EXP
  }

  // priors
  si ~ cauchy(0, 2.5);
  mu1 ~ gamma(2, 1);
  mu2 ~ gamma(5, 0.002);
  lambda ~ beta(2, 2);
  lambda_zero ~ beta(2, 2);
  mu_zero ~ normal(0, 1);
}
generated quantities {
  int y_hat_1[G];  // posterior predictive draws from component 1
  int y_hat_2[G];  // posterior predictive draws from component 2

  for (g in 1:G) y_hat_1[g] = neg_binomial_2_rng(mu1, si[1]);
  for (g in 1:G) y_hat_2[g] = neg_binomial_2_rng(mu2, si[2]);
}

Now I would like to get the probability that y[g] belongs to negative binomial cluster 1 or cluster 2.

It’s lambda. Your model assumes the mixture probability is homogeneous over each draw, so it’s the same probability for each datum.

But isn’t the probability of y[1] | cluster[1] higher for a datum at the centre of the cluster than for one that is far away from that cluster?

Wouldn’t it be log(lambda) + neg_binomial_2_lpmf(y[g] | mu1, si[1])

and then I could at least compare the log likelihoods across data points?

@betanalpha’s still jetlagged. What he gave you is the prior probability before observing y. If we let z[g] in {1, 2} be the component from which y[g] is drawn, then conditioning on y[g] gives the unnormalized weights

p1 = lambda * neg_binomial_2(y[g] | mu1, si[1])

p2 = (1 - lambda) * neg_binomial_2(y[g] | mu2, si[2])

and the posterior probability that y[g] was drawn from component 1 is

Pr[z[g] = 1 | y[g], lambda, mu, si]  =  p1 / (p1 + p2)

To do this with arithmetic stability, you want to put everything on the log scale and use log1m(lambda) instead of log(1 - lambda), so it should look like this:

generated quantities {
  real<lower=0, upper=1> log_Pr[G];
  for (g in 1:G) {
    real lp1 = log(lambda) + neg_binomial_2_lpmf(y[g] | mu1, si[1]);
    real lp2 = log1m(lambda) + neg_binomial_2_lpmf(y[g] | mu2, si[2]);
    log_Pr[g] = lp1 - log_sum_exp(lp1, lp2);
  }
}

Thanks Bob and Betan

:)

Should I compute generated quantities in a two-step model, with the parameters passed back in as data, or should I calculate the generated quantities in the prediction model itself?

Bw.

Until we figure out how to allow generated quantities to run after sampling, it all has to go in the same model. The generated quantities are very fast to execute as we don’t need derivatives or multiple steps per iteration.

As to the earlier question, if you want to evaluate the likelihood, that’s what you already coded up for the mixture—the likelihood of the data marginalizes out the latent discrete responsibility parameter.
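
For example, a minimal sketch of that for your model, in generated quantities (same parameter names as above, ignoring the zero-inflation terms for brevity):

generated quantities {
  // marginal log likelihood of each observation, with the
  // component indicator summed out by log_mix
  vector[G] log_lik;
  for (g in 1:G)
    log_lik[g] = log_mix(lambda,
                         neg_binomial_2_lpmf(y[g] | mu1, si[1]),
                         neg_binomial_2_lpmf(y[g] | mu2, si[2]));
}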

So just to make sure:

I can save neg_binomial_2_lpmf(y[g] | mu1, si[1]) directly into generated quantities, for later calculation of the cluster assignments?

In any case,

isn’t it a bit more complicated because of the zero-inflation? I have lambda and lambda_zero. My goal, however, is to get the probability of cluster 2,

neg_binomial_2_lpmf(y[g] | mu2, si[2])

vs. anything else (cluster 1 + zero-inflation).

P.S. I didn’t get the sentence about the latent discrete “responsibility” parameter.

The discrete parameter that determines the mixture component is often called the “responsibility”. It’s an index for which component you drew from.
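
For intuition only, the generative process with the responsibility written out explicitly would look something like this (hypothetical pseudocode; you can’t declare discrete parameters like z in a Stan program, which is why the model marginalizes them out):

// hypothetical generative process, not valid Stan:
z[g] ~ categorical([lambda, 1 - lambda]');               // draw the responsibility
y[g] ~ neg_binomial_2(z[g] == 1 ? mu1 : mu2, si[z[g]]);  // draw from that component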

You could do the same calculation for the zero cases, and they’ll all be the same, so you only need to do it once.

Sorry to bother you, but is this plausible? P(cluster 2 | … )

	for (g in 1:G) {
		if (y[g] == 0) {
			// a zero from the zero-inflated component (weight lambda_zero) ...
			real lp0 = log(lambda_zero) + bernoulli_logit_lpmf(0 | mu_zero[1]);
			// ... or a zero count from either negative binomial component
			// (weight 1 - lambda_zero, matching the non-zero branch)
			real lp1 = log1m(lambda_zero) + bernoulli_logit_lpmf(1 | mu_zero[2])
			           + log(lambda) + neg_binomial_2_lpmf(y[g] | mu1, si[1]);
			real lp2 = log1m(lambda_zero) + bernoulli_logit_lpmf(1 | mu_zero[2])
			           + log1m(lambda) + neg_binomial_2_lpmf(y[g] | mu2, si[2]);
			log_Pr[g] = lp2 - log_sum_exp(lp0, log_sum_exp(lp1, lp2));
		}
		else {
			// a non-zero count can't come from the zero component, so it
			// drops out of the sum (its weight would be 0, not log weight 0)
			real lp1 = log1m(lambda_zero) + bernoulli_logit_lpmf(1 | mu_zero[2])
			           + log(lambda) + neg_binomial_2_lpmf(y[g] | mu1, si[1]);
			real lp2 = log1m(lambda_zero) + bernoulli_logit_lpmf(1 | mu_zero[2])
			           + log1m(lambda) + neg_binomial_2_lpmf(y[g] | mu2, si[2]);
			log_Pr[g] = lp2 - log_sum_exp(lp1, lp2);
		}
	}

Yes.

Just one note,

real<lower=0, upper=1> log_Pr[G];

should probably read

real log_Pr[G];

since it is a log probability

You’re right; I originally started with probabilities. You could add real<upper=0>, but it’s not necessary since it only does consistency checking.
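
For reference, with that change the earlier block would read:

generated quantities {
  real<upper=0> log_Pr[G];  // log probabilities are at most 0
  for (g in 1:G) {
    real lp1 = log(lambda) + neg_binomial_2_lpmf(y[g] | mu1, si[1]);
    real lp2 = log1m(lambda) + neg_binomial_2_lpmf(y[g] | mu2, si[2]);
    log_Pr[g] = lp1 - log_sum_exp(lp1, lp2);
  }
}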