Finite mixture model for binned (grouped) data

Hello all,

I was wondering if there is an efficient way to fit finite mixture models (e.g., Gaussians) using binned/grouped data. By binned data, I mean (x = 1, 2, … etc., y = 15, 169, … etc.), where x are the values and y are the corresponding numbers of samples for each value. The examples in the manual work great for individual observations, where the observations are fed in as a single vector rather than aggregated. The problem is that I have a lot of bins and a lot of observations in each bin; expanding the data would create a very large vector of 100K individual data points, and fitting that would be very slow in Stan.

I was able to fit my data using the R package mixR (https://cran.r-project.org/web/packages/mixR/vignettes/intro-to-mixr.pdf), which allows mixture models to be fit to binned data. The computational gain was huge compared to feeding the data in as a single vector. However, the likelihood mixR uses for binned data is specific to the expectation maximization algorithm the package uses, and I was unable to convert it to Stan. Any help would be appreciated, as I have been struggling with this for a few days now. I hope this is clear; I can provide some code if it helps.

Thanks

If I understand correctly, with x = 1, 2, 5, ... and y = 15, 169, ... then you have 1 observation of value 15, 2 observations of value 169, 5 observations of the next value, and so on. This is easy to deal with, because you can write down the mixture log likelihood for a single observation of each value and then multiply it by the number of observations of that value.
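
Written out, that grouped-data log likelihood is just the ordinary mixture log likelihood with each term weighted by its count (same x-as-counts convention as above):

\log p(y \mid \lambda, \mu, \sigma) = \sum_{n=1}^{N} x_n \log \sum_{k=1}^{K} \lambda_k \, \operatorname{normal}(y_n \mid \mu_k, \sigma_k)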

Concretely, it looks like this (modulo typos):

data {
  int<lower=0> N;  // number of distinct observed values (bins)
  array[N] int<lower=0> x;  // number of observations of each value
  array[N] real y;  // observe y[n] a total of x[n] times
  int<lower=1> K;  // mixture components
}
parameters {
  vector<lower=0>[K] sigma;
  vector[K] mu;
  simplex[K] lambda;  // mixing proportions
}
model {
  // priors
  sigma ~ ...
  mu ~ ...
  lambda ~ ...

  // likelihood
  vector[K] log_lambda = log(lambda);
  for (n in 1:N) {
    vector[K] lp = log_lambda;
    for (k in 1:K) {
      lp[k] += normal_lpdf(y[n] | mu[k], sigma[k]);
    }
    real log_lik_y_n = log_sum_exp(lp);  // log likelihood of single observation of y[n]
    target += x[n] * log_lik_y_n;  // scale to x[n] observations of y[n]
  }
}
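
As a side note, for the special case of K = 2 the inner loop and log_sum_exp can be collapsed into a single call to log_mix, which performs the same marginalization (just a sketch of the equivalent form, with lambda[1] as the weight on the first component):

for (n in 1:N) {
  // log_mix(theta, lp1, lp2) = log(theta * exp(lp1) + (1 - theta) * exp(lp2))
  target += x[n] * log_mix(lambda[1],
                           normal_lpdf(y[n] | mu[1], sigma[1]),
                           normal_lpdf(y[n] | mu[2], sigma[2]));
}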

Oops, forgot to answer this. I’m guessing their E-step has the same marginalization I used buried inside it somewhere. Usually, you can just translate the E-step of an EM algorithm to define a likelihood in Stan. I did that in the user’s guide for mixture models, LDA, HMMs, the CJS model (really just a kind of HMM), data coding models (kinds of mixtures), and change point models (latent discrete models, but arguably also a kind of mixture).

That’s what’s going on above. In fact, if you put the model I included into our optimizer, the result is what’s known as “generalized EM”. The generalization is to allow the M-step to hill climb rather than maximize.
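
To make the connection explicit (a sketch in the same notation as the model above): the E-step computes the responsibility of component k for value y_n,

r_{nk} = \frac{\lambda_k \, \operatorname{normal}(y_n \mid \mu_k, \sigma_k)}{\sum_{k'=1}^{K} \lambda_{k'} \, \operatorname{normal}(y_n \mid \mu_{k'}, \sigma_{k'})},

and the denominator is exactly the per-value likelihood that log_sum_exp(lp) computes on the log scale in the Stan program.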

Thank you very much for such a quick response! Yes, this is exactly what I was looking for. No worries about the EM; I was just looking for others who have looked into fitting a finite mixture to grouped data. I would much rather have it in Stan, because I am hoping to incorporate this into a CAR model.