# How to compute P-value for Mixture Model

Hello! I'm fitting the data using a two-component mixture model `y_i ~ w_{0i} f(.|θ_{0i}) + w_{1i} f(.|θ_{1i})`, where `w_{0i}` and `w_{1i}` represent the mixing probabilities and `f(.|θ)` represents the Poisson distribution. After running the model through Stan, I got output like the following:

| yi | W1i |
|----|-----|
| 5  | 0.4 |
| 2  | 0.7 |
| 10 | 0.6 |
| 2  | 0.4 |

Finally, I define a new latent variable `z_i, i = 1,…,n` that indicates which group an observation belongs to, i.e., whether it is in the first or second category. The indicator variable has two outcomes (0 and 1) and follows a Bernoulli distribution, `z_i ~ Bernoulli(w_{1i})` for `i = 1, 2,…, n`. Observation `i` is assigned to the second group (I call these significant observations) whenever `P(z_i = 1|y)` is bigger than a cutoff value, say 0.5.
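For readers following along, here is a minimal sketch of how a membership probability of this kind can be computed for a single observation when the mixture parameters are treated as known. The function names and the numeric rates are hypothetical and not taken from the fitted model; in a full Bayesian fit this quantity would typically be averaged over posterior draws rather than evaluated at one parameter value.

```python
import math


def poisson_pmf(y, rate):
    """Poisson probability mass function evaluated at count y."""
    return math.exp(-rate) * rate ** y / math.factorial(y)


def membership_probability(y, w1, rate0, rate1):
    """P(z = 1 | y) for a two-component Poisson mixture with known parameters.

    w1 is the mixing probability of the second component and (1 - w1)
    that of the first; rate0 and rate1 are the two Poisson rates.
    """
    numerator = w1 * poisson_pmf(y, rate1)
    denominator = (1.0 - w1) * poisson_pmf(y, rate0) + numerator
    return numerator / denominator


# Hypothetical parameter values, chosen only for illustration.
p = membership_probability(y=5, w1=0.4, rate0=2.0, rate1=6.0)
print(round(p, 3))
```

When both components have the same rate, the data carry no information about group membership and the result reduces to the mixing probability `w1`.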

My question is:

Do I need to use statistical significance or FDR to select significant observations, instead of using one ad hoc number (calling an observation significant when the posterior probability `w_{1i}` > 0.5)?

Hi yab!

I'm not really sure, but I would say "it depends". In a Bayesian approach you don't "need" statistical significance thresholds. What constitutes a significant observation should IMO come from your domain expertise. If the notion is "it's a significant observation when it's more likely to be in category 1 than category 0", then the W1i > 0.5 threshold makes sense. However, if an observation should only count as significant when it almost surely falls into category 1, then you'd probably want to go for something like W1i > 0.95. But that's more of a decision (as in decision theory) than an estimation issue, I would say.
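To make the effect of the cutoff concrete, here is a small sketch using the four rows posted above. It assumes the W1i column can be read as the posterior probability that an observation is in category 1. The helper names are made up for illustration, and the "expected FDR" shown (the average of 1 - W1i over the selected observations) is one common Bayesian-style summary, not something Stan reports for you.

```python
# Values copied from the table in the question.
y = [5, 2, 10, 2]
w1 = [0.4, 0.7, 0.6, 0.4]  # assumed to be P(category 1 | data) per observation


def select(probs, cutoff):
    """Indices of observations whose probability exceeds the cutoff."""
    return [i for i, p in enumerate(probs) if p > cutoff]


def expected_fdr(probs, selected):
    """Average probability that a selected observation is NOT in category 1."""
    if not selected:
        return 0.0
    return sum(1.0 - probs[i] for i in selected) / len(selected)


for cutoff in (0.5, 0.95):
    chosen = select(w1, cutoff)
    print(cutoff, chosen, round(expected_fdr(w1, chosen), 3))
```

With these numbers the 0.5 cutoff keeps the second and third observations, while the 0.95 cutoff keeps none, which is the trade-off the choice of threshold is meant to express.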

I hope this was at least a bit helpful. Maybe others have more/different ideas…

Cheers,
Max
