How do I visualize or determine cutoff values from a mixture model?

Hi all,

I am just beginning to work with mixture models. My serological data are MFI values, some of them negative, ranging from -67.75 to 2759.50 (n = 1469). They are heavily right-skewed: most of the mass sits at low values, with a long tail toward high values. The values come from two different processes, and I would like to estimate the two distributions using a mixture model. The following model runs fine, and the Rhat and ESS values look good.

library(brms)

mix <- mixture(gaussian, gaussian)

prior <- c(prior(normal(0,0.5), Intercept, dpar = mu1),
           prior(normal(0,0.5), Intercept, dpar = mu2))

fit <- brm(NiV ~ 1,
           data = NiV,
           family = mix,
           prior = prior,
           chains = 1, backend = "cmdstanr")

Family: mixture(gaussian, gaussian) 
  Links: mu1 = identity; sigma1 = identity; mu2 = identity; sigma2 = identity; theta1 = identity; theta2 = identity 
Formula: NiV ~ 1 
   Data: NiV (Number of observations: 1469) 
  Draws: 1 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 1000

Population-Level Effects: 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
mu1_Intercept     0.04      0.49    -0.88     0.99 1.01      418      437
mu2_Intercept     1.62      0.43     0.80     2.47 1.00      918      381

Family Specific Parameters: 
       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma1   521.70     36.56   457.91   594.91 1.01      832      645
sigma2    28.18      0.65    26.98    29.45 1.00     1014      863
theta1     0.07      0.01     0.06     0.09 1.00      724      623
theta2     0.93      0.01     0.91     0.94 1.00      723      623

I am at a loss as to how to interpret the results. I would like to estimate cutoffs on the scale of the original data. For example, is the value 2759.5 in the 80th percentile of mu2? Or what are the cutoffs for mu2 in real terms? To add a little more explanation, mu1 represents participants who are negative for a virus and mu2 represents participants who are positive. I would like to assign each data value to mu1 or mu2 (negative or positive; high MFI values are positive).

Thanks in advance for your help.
Alan

These results are telling you that the data are being modeled as a mixture of two normals, one with mean near zero and standard deviation near 520, and the other with mean near 1.6 and standard deviation near 28. The first normal accounts for 7% of the total probability mass, and the second normal accounts for 93%. To figure out the probability that an observation with value y comes from the first mixture component, use brms::pp_mixture, which under the hood does something like (the below is just pseudocode):

a = theta1 * normalPDF(y | mu1, sigma1)
b = theta2 * normalPDF(y | mu2, sigma2)
a / (a + b)
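
For a quick check without the fitted object, the same calculation can be sketched in base R. This is only a plug-in approximation: it assumes the posterior means from the summary above as fixed parameter values, whereas pp_mixture works from the posterior draws. The function name `p_comp1`, the variable `cutoff`, and the search interval passed to `uniroot` are my own choices for illustration. Component 1 is the wide component (sigma1 near 520), so in this fit it is the one that covers the high MFI values.

```r
# Posterior means copied from the model summary (plug-in point estimates,
# an assumption made for illustration; pp_mixture uses the posterior draws).
theta <- c(0.07, 0.93)
mu    <- c(0.04, 1.62)
sigma <- c(521.70, 28.18)

# Probability that an observation y belongs to component 1 (the wide one).
p_comp1 <- function(y) {
  a <- theta[1] * dnorm(y, mean = mu[1], sd = sigma[1])
  b <- theta[2] * dnorm(y, mean = mu[2], sd = sigma[2])
  a / (a + b)
}

# Membership probabilities for a few example values, including the maximum.
p_comp1(c(0, 50, 100, 2759.5))

# Upper cutoff: the value above the narrow component's mean at which both
# components are equally likely, i.e. where p_comp1(y) crosses 0.5.
# The upper search bound of 500 is an arbitrary choice that brackets the root.
cutoff <- uniroot(function(y) p_comp1(y) - 0.5,
                  lower = mu[2], upper = 500)$root
cutoff

# With the fitted model itself (assumes `fit` from the question is in scope):
# pp <- brms::pp_mixture(fit)  # observations x summary statistics x components
# head(pp[, 1, ])              # point estimates for each observation
```

Values above `cutoff` are more likely to come from component 1 than from component 2 under these point estimates. Because component 1 is so much wider, it also becomes the more likely component for strongly negative values, so there is a second crossing below the mean of component 2. A threshold other than 0.5 can be used if false positives and false negatives are not equally costly.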

Thank you so much. The results make a lot more sense now.