Join two clusters in a mixed distribution

Hi,

I implemented a mixed distribution following Mardia and Sutton, combining a von Mises distribution (for angle) with a normal distribution (for dist).
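For reference, this is the conditional structure the code in this post encodes, as read from the code itself rather than checked against the paper. The symbol mapping is an assumption made for readability: θ is angle, x is dist, μ₀ is a_mu, μ is d_mu, σ is d_sigma.

```latex
\begin{align}
\theta &\sim \mathrm{vM}(\mu_0, \kappa), \\
x \mid \theta &\sim \mathcal{N}(\mu_c, \sigma_c^2), \\
\mu_c &= \mu + \sigma \sqrt{\kappa}\,\bigl[\rho_1(\cos\theta - \cos\mu_0) + \rho_2(\sin\theta - \sin\mu_0)\bigr], \\
\sigma_c^2 &= \sigma^2 (1 - \rho^2), \qquad \rho = \sqrt{\rho_1^2 + \rho_2^2}.
\end{align}
```

Note that the last line defines a variance, whereas Stan's normal_lpdf is parameterized by a standard deviation, so it is σ_c and not σ_c² that belongs in its scale argument.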

functions {
  // Conditional mean of dist given angle for one class.
  real d_mu_c_helper_fct(real d_mu, real d_sigma, real kappa,
                         real rho_1, real rho_2, real angle, real a_mu) {
    real d_mu_c = d_mu + d_sigma * sqrt(kappa)
      * (rho_1 * (cos(angle) - cos(a_mu)) + rho_2 * (sin(angle) - sin(a_mu)));
    return d_mu_c;
  }
}
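data {
  int<lower=1> K;    // number of classes
  int<lower=0> N;    // number of data points
  vector[N] angle;   // circular observations, in radians
  vector[N] dist;    // linear observation paired with each angle
}
parameters {
  vector<lower=0, upper=2 * pi()>[K] a_mu;  // von Mises locations
  vector<lower=0, upper=8>[K] kappa;        // von Mises concentrations
  vector<lower=0, upper=18>[K] d_mu;        // marginal means of dist
  vector<lower=0, upper=6>[K] d_sigma;      // marginal sds of dist

  real<lower=0, upper=1> rho_1;
  real<lower=0, upper=1 - rho_1> rho_2;
}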
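transformed parameters {
  real rho = sqrt(square(rho_1) + square(rho_2));  // eq 1.3

  // Conditional standard deviation of dist given angle. Eq 1.3 gives the
  // variance d_sigma^2 * (1 - rho^2); normal_lpdf expects a standard
  // deviation, so the square root is taken here.
  vector[K] d_sigma_c = d_sigma * sqrt(1 - square(rho));
}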
model {
  // Priors (restricted to the declared parameter bounds)
  d_mu ~ normal(5, 5);
  kappa ~ normal(5, 5);
  rho_1 ~ normal(1, 1);
  rho_2 ~ normal(1, 1);
  d_sigma ~ normal(2, 2);
  a_mu ~ von_mises(4, 2);

  for (n in 1:N) {
    vector[K] pb;  // log density of point n under each class
    for (k in 1:K) {
      // conditional mean of dist given angle, via the helper function
      real d_mu_c = d_mu_c_helper_fct(d_mu[k], d_sigma[k], kappa[k],
                                      rho_1, rho_2, angle[n], a_mu[k]);

      // joint log density: von Mises for angle, normal for dist given angle
      pb[k] = von_mises_lpdf(angle[n] | a_mu[k], kappa[k])
              + normal_lpdf(dist[n] | d_mu_c, d_sigma_c[k]);
    }
    // marginalize over classes; equal mixing weights are implied
    target += log_sum_exp(pb);
  }
}

It runs fine, but my model almost always ends up with two clusters very close together. I don’t know why this happens, or whether there is a way to join them.

stan.o109552_plot.pdf (19.0 KB)

Mixture models are tricky. This sort of thing happens with the ones I know of. You can fit the model with different numbers of mixture components and try to choose between them with some form of model selection, but it usually doesn’t work out as smoothly as you might hope.

What’re you trying to do with this? Maybe there’s another way to model it.


Michael Betancourt wrote a case study on mixture models on our web site (users >> documentation >> case studies) that explains what’s going on here.
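One commonly used device for making mixture components identifiable is an ordering constraint on a location parameter, optionally combined with explicit mixing proportions. The following is a minimal sketch of that idea for a plain normal mixture on dist only, not the von Mises model from this thread. The names theta and lp are introduced here for illustration, and switching d_mu to positive_ordered drops the original upper bound of 18.

```stan
data {
  int<lower=1> K;   // number of classes
  int<lower=0> N;   // number of data points
  vector[N] dist;
}
parameters {
  simplex[K] theta;            // mixing proportions (assumed addition)
  positive_ordered[K] d_mu;    // d_mu[1] < d_mu[2] < ... removes label switching
  vector<lower=0>[K] d_sigma;
}
model {
  vector[K] log_theta = log(theta);

  d_mu ~ normal(5, 5);
  d_sigma ~ normal(2, 2);

  for (n in 1:N) {
    vector[K] lp = log_theta;  // start from the log mixing weights
    for (k in 1:K) {
      lp[k] += normal_lpdf(dist[n] | d_mu[k], d_sigma[k]);
    }
    target += log_sum_exp(lp);
  }
}
```

Ordering addresses label switching only. It does not by itself stop two components from settling close together, which may simply indicate that K is larger than the data support. With explicit mixing proportions, a superfluous component can at least shrink towards zero weight.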

There’s also the issue of how to parameterize the variables for von Mises or other circular distributions. If they’re not bunched up near one of the edges, it should be fine. I believe the von Mises implementation we have wraps and it’s up to the user program to send it values in a fixed 2 pi range.
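If the raw angles might fall outside a single 2 pi interval, they can be mapped into [0, 2 pi) once in transformed data. This is a minimal single-component sketch, assuming the angles are in radians. The name angle_wrapped is introduced here for illustration.

```stan
data {
  int<lower=0> N;
  vector[N] angle;  // raw angles in radians, any range
}
transformed data {
  vector[N] angle_wrapped;
  for (n in 1:N) {
    // subtract whole turns so that each value lands in [0, 2*pi)
    angle_wrapped[n] = angle[n] - 2 * pi() * floor(angle[n] / (2 * pi()));
  }
}
parameters {
  real<lower=0, upper=2 * pi()> a_mu;
  real<lower=0> kappa;
}
model {
  kappa ~ normal(5, 5);
  angle_wrapped ~ von_mises(a_mu, kappa);
}
```

Wrapping the data does not help when a cluster's mean direction sits close to 0 or 2 pi, because the bounded a_mu then has posterior mass at both ends of its interval. That is the "bunched up near one of the edges" case.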

Thanks for the idea with the case study, I am trying to get something useful out of it.
I was trying to get clusters, i.e. a probabilistic classification of the data, but if this doesn’t work I don’t think I am going to try another way to model it. Thanks nevertheless!
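For the probabilistic classification itself, the per-class log densities computed in the model block can be normalized into membership probabilities. The block below is a sketch of a generated quantities block meant to be appended to the program from the first post. It is not a standalone program: it reuses that program's data, parameters and helper function, and assumes the same equal mixing weights. The name class_prob is introduced here.

```stan
generated quantities {
  // class_prob[n, k]: probability that point n belongs to class k,
  // given the current parameter draw
  matrix[N, K] class_prob;
  for (n in 1:N) {
    vector[K] pb;
    for (k in 1:K) {
      real d_mu_c = d_mu_c_helper_fct(d_mu[k], d_sigma[k], kappa[k],
                                      rho_1, rho_2, angle[n], a_mu[k]);
      pb[k] = von_mises_lpdf(angle[n] | a_mu[k], kappa[k])
              + normal_lpdf(dist[n] | d_mu_c, d_sigma_c[k]);
    }
    // softmax turns the log densities into probabilities that sum to 1
    class_prob[n] = softmax(pb)';
  }
}
```

Averaging class_prob over draws only gives a meaningful classification if the class labels are stable across chains and iterations. If two classes swap or overlap, the averaged probabilities for them will be blurred together.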