Random Question: Stan Model vs EM Algorithm

Kevin_Li · June 14, 2018, 1:06am

Hi everyone, I’m new here, so I’m not sure how to phrase this question or where to put it. I am fitting multiple two-component Gaussian mixture models to genetic data. I first used a vanilla EM algorithm (the mclust library) and then followed the user manual to write a similar model in Stan. I assumed the two would give very similar results, but the Stan model seems to separate the mixture components much more than the EM algorithm does whenever the components are not obvious. I attached my code and a case to illustrate the difference. The graph labeled stan is the Stan model and the bottom graph is the generic EM. Does anyone know why this is the case? Improvements to my simple code are also appreciated.

data {
  int<lower = 1> N; // number of genes
  int<lower = 1> J; // number of experiments
  matrix[N, J] y;   // data
}
parameters {
  simplex[2] theta[N];        // mixing proportions
  ordered[2] mu[N];           // locations of mixture components
  vector<lower = 0>[N] sigma; // sdevs
}
model {
  vector[2] log_theta[N];

  sigma ~ lognormal(0, 2);

  for (n in 1:N) {
    log_theta[n] = log(theta[n]);

    mu[n] ~ normal(0, 2);

    for (j in 1:J) {
      vector[2] lps = log_theta[n];

      for (k in 1:2) {
        lps[k] = lps[k] + normal_lpdf(y[n, j] | mu[n][k], sigma[n]);
      }

      target += log_sum_exp(lps);
    }
  }
}

bgoodri · June 14, 2018, 1:12am

If you are using the (default) MCMC algorithm, then it is no surprise that posterior means / medians differ from the point estimate that EM converges to deterministically. And since Stan’s default MCMC algorithm, if all goes well, actually samples the tails of the distribution, it is even less of a surprise.
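
To make the mean / median / maximum distinction concrete, here is a small Python sketch. It is only an illustration: it borrows the lognormal(0, 2) shape from the sigma prior in the model above as a convenient skewed density, and it is not the posterior of that model. The closed-form expressions are the standard ones for a lognormal distribution.

```python
import math

# Lognormal(m, s): same shape as the sigma prior in the Stan model above.
# Assumption: used purely as an example of a skewed density, not as the
# actual posterior of any parameter.
m, s = 0.0, 2.0

mode = math.exp(m - s ** 2)        # where the density peaks (what a maximizer reports)
median = math.exp(m)               # 50% quantile
mean = math.exp(m + s ** 2 / 2.0)  # expectation, pulled up by the long right tail

print(f"mode={mode:.4f} median={median:.4f} mean={mean:.4f}")
# prints: mode=0.0183 median=1.0000 mean=7.3891
```

For a skewed density the three summaries can differ by orders of magnitude, so a sampler that explores the tails will generally not report the same number as a procedure that climbs to the peak.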

Guido_Biele · June 14, 2018, 8:16am

mclust has many different mixture models. For example, the (co)variance of dimensions can be fixed or free across components. Which of the mclust models is your Stan code emulating?
Your model seems to assume that the variance for each dimension is constant across mixture components, whereas the covariance between dimensions remains unmodeled.

Bob_Carpenter · June 15, 2018, 6:31pm

It’s really full Bayes vs. point estimation you’re comparing, assuming everything’s working properly. You can see the difference within Stan by using optimization—that should give you the same answer as EM if the EM is working and the models are the same.
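
As a rough picture of the point-estimation side of that comparison, here is a minimal EM sketch in plain Python for one row of data: a two-component Gaussian mixture with a single shared standard deviation, matching the one sigma[n] per gene in the Stan model above. Everything here is an assumption for illustration: the data are simulated, the function names are made up, and this is not mclust's implementation. It also has no priors, so it maximizes the likelihood only, whereas Stan's optimizer would maximize the log density in the model block, including the normal(0, 2) and lognormal(0, 2) terms. The two should therefore agree closely only when those priors matter little relative to the data.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_lik(y, theta, mu, sigma):
    # Marginal log-likelihood of the mixture (component labels summed out).
    return sum(
        math.log(theta * normal_pdf(v, mu[0], sigma)
                 + (1.0 - theta) * normal_pdf(v, mu[1], sigma))
        for v in y
    )

def em_two_gaussians(y, n_iter=200):
    # Crude initialization: split the sorted data in half.
    ys = sorted(y)
    half = len(ys) // 2
    mu = [sum(ys[:half]) / half, sum(ys[half:]) / (len(ys) - half)]
    grand_mean = sum(y) / len(y)
    sigma = math.sqrt(sum((v - grand_mean) ** 2 for v in y) / len(y))
    theta = 0.5
    trace = [log_lik(y, theta, mu, sigma)]
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each observation.
        r = []
        for v in y:
            a = theta * normal_pdf(v, mu[0], sigma)
            b = (1.0 - theta) * normal_pdf(v, mu[1], sigma)
            r.append(a / (a + b))
        # M-step: weighted updates; one sigma shared by both components.
        n0 = sum(r)
        n1 = len(y) - n0
        theta = n0 / len(y)
        mu = [
            sum(ri * v for ri, v in zip(r, y)) / n0,
            sum((1.0 - ri) * v for ri, v in zip(r, y)) / n1,
        ]
        ss = sum(ri * (v - mu[0]) ** 2 + (1.0 - ri) * (v - mu[1]) ** 2
                 for ri, v in zip(r, y))
        sigma = math.sqrt(ss / len(y))
        trace.append(log_lik(y, theta, mu, sigma))
    return theta, mu, sigma, trace

# Simulated stand-in for one gene's measurements (not real data).
random.seed(1)
y = ([random.gauss(-1.5, 0.7) for _ in range(60)]
     + [random.gauss(1.5, 0.7) for _ in range(40)])

theta, mu, sigma, trace = em_two_gaussians(y)
print("theta:", theta, "mu:", mu, "sigma:", sigma)
```

The result is a single set of numbers at a (local) maximum of the likelihood. A full Bayes fit instead returns a distribution over theta, mu, and sigma, and its means or medians need not sit at that maximum.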
