Iris dataset, PyStan GMM with dimensionality reduction by linear regression: max_treedepth warnings

Hi, I’m using the Iris dataset to try to do inference on a mixture model with one component for each of the 3 species.
I’m using PyStan.
The idea is as follows: the input is the 150x4 matrix of 4-d plant features. I want to find a 4-d beta that gives y = x*beta, a 150x1 vector holding a new 1-d feature, and then fit a 3-component GMM to this y.
My Stan code is:

data {
  int<lower=0> N; // number of data points
  int<lower=1> K; // initial number of features
  matrix[N,K] x;  // original features matrix
  real pmubeta;   // prior mean for beta
  real psbeta;    // prior std for beta
  real psmu;      // prior std for mu, passed as data so it can be changed without recompiling
  real pssigma;   // prior scale for sigma (lognormal), passed as data for the same reason
}
parameters {
  vector[K] beta;
  ordered[3] mu;
  real<lower=0> sigma[3];
  simplex[3] theta; // probability of each component of the GMM
}
model {
  vector[N] y;
  beta ~ normal(pmubeta, psbeta);
  mu ~ normal(0, psmu);
  sigma ~ lognormal(0, pssigma);
  y = x*beta; // project the K features onto one dimension
  for (i in 1:N) {
    real lp[3];
    for (j in 1:3)
      lp[j] = log(theta[j]) + normal_lpdf(y[i] | mu[j], sigma[j]);
    target += log_sum_exp(lp); // marginalize over the component label
  }
}
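
For anyone who wants to check what the loop in the model block computes, here is a minimal pure-Python sketch of the same marginalized mixture log-likelihood. The helper names and the numbers for y, theta, mu and sigma are made up for illustration; they are not values from my runs.

```python
import math

def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) at y (sigma is the standard deviation)."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def mixture_log_lik(y, theta, mu, sigma):
    """Sum over observations of the log marginal mixture density."""
    total = 0.0
    for yi in y:
        # One term per component: log weight plus log density
        lp = [math.log(t) + normal_lpdf(yi, m, s)
              for t, m, s in zip(theta, mu, sigma)]
        total += log_sum_exp(lp)
    return total

# Hypothetical values, chosen only for illustration
y = [-2.1, -1.9, 0.1, 0.0, 2.2, 1.8]
theta = [0.3, 0.4, 0.3]
mu = [-2.0, 0.0, 2.0]
sigma = [0.5, 0.5, 0.5]
print(mixture_log_lik(y, theta, mu, sigma))
```

Working on the log scale with log_sum_exp avoids underflow when one component assigns a very small density to a point, which is why the Stan code is written this way rather than summing densities directly.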

To get the prior mean for beta (pmubeta), I run PCA and use the mean of the first eigenvector.
For the other priors (psbeta, psmu, pssigma) I tried all kinds of values.
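
A minimal sketch of the PCA step described above, assuming NumPy and a synthetic stand-in for the 150x4 Iris matrix. `first_pc` is a hypothetical helper written for this illustration, not the exact code behind the results in this post.

```python
import numpy as np

def first_pc(x):
    """Return the unit-length first principal direction of x (rows = observations)."""
    xc = x - x.mean(axis=0)  # center each feature
    # Rows of vt are the principal directions, ordered by explained variance
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    v = vt[0]
    # The sign of an eigenvector is arbitrary, so fix a convention:
    # make the largest-magnitude entry positive
    if v[np.argmax(np.abs(v))] < 0:
        v = -v
    return v

# Synthetic stand-in for the Iris feature matrix (hypothetical data)
rng = np.random.default_rng(0)
t = rng.normal(size=(150, 1))
x = t * np.array([[1.0, 1.0, 1.0, 1.0]]) + 0.05 * rng.normal(size=(150, 4))

v = first_pc(x)
pmubeta = float(v.mean())  # scalar prior mean shared by all entries of beta
print(v, pmubeta)
```

Two things to keep in mind with this approach: the sign of the eigenvector is arbitrary, so the mean can flip sign unless a convention is fixed, and a scalar pmubeta gives every coefficient of beta the same prior mean even when the entries of the eigenvector differ.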
My issue is that the fitting/sampling takes forever and I get these warnings:

WARNING:pystan:n_eff / iter below 0.001 indicates that the effective sample size has likely been overestimated
WARNING:pystan:Rhat above 1.1 or below 0.9 indicates that the chains very likely have not mixed
WARNING:pystan:3902 of 4000 iterations saturated the maximum tree depth of 10 (97.5 %)
WARNING:pystan:Run again with max_treedepth larger than 10 to avoid saturation
WARNING:pystan:Chain 1: E-BFMI = 0.184
WARNING:pystan:Chain 4: E-BFMI = 0.0712
WARNING:pystan:E-BFMI below 0.2 indicates you may need to reparameterize your model

So I guess the problem is with my priors, but I’m not sure…

I also get the following trace plot when using ArviZ:


I also tried to use init for the mu values to keep the mixture components identifiable.

If anyone can help, that would be great!

Thanks a lot!

Sorry, it looks like your question fell through. Maybe @mike-lawrence is not busy and can answer?

OK, so I’ll give it a try… Short on time, so just some hints:

  • It appears that there is some label switching going on, i.e. while mu is ordered, the data still can’t constrain the components, probably because of the additional variability in sigma and/or theta (e.g. in the “red” chain you have low mu[0] with large sigma[0], while the others have larger mu and smaller sigma).
  • Another possible cause is that you just have too many components. Imagine fitting a two-component mixture to data from a single Gaussian: either both components have basically the same mu and sigma and theta is completely arbitrary, or one of the thetas is close to 0 and that component can have arbitrary mu and sigma. All of those options would fit the data equally well.
  • A pairs plot (there are descriptions for R, but I believe it is easy to do in Python as well) would help with further diagnosis.
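
To make the second hint concrete, here is a small pure-Python sketch under assumed settings: 200 simulated draws from a single standard Gaussian and a hypothetical two-component mixture likelihood. It only illustrates the flat directions in the likelihood; it is not a diagnosis of the posted fit.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    """Density of Normal(mu, sigma) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def two_component_log_lik(y, theta, mu, sigma):
    """Log-likelihood of a two-component mixture with weights (theta, 1 - theta)."""
    return sum(
        math.log(theta * normal_pdf(yi, mu[0], sigma[0])
                 + (1.0 - theta) * normal_pdf(yi, mu[1], sigma[1]))
        for yi in y
    )

random.seed(1)
y = [random.gauss(0.0, 1.0) for _ in range(200)]  # data from ONE Gaussian

# Case 1: both components identical, so the weight theta has no effect
same = [two_component_log_lik(y, th, (0.0, 0.0), (1.0, 1.0))
        for th in (0.1, 0.5, 0.9)]

# Case 2: the first component has almost zero weight,
# so its mu and sigma barely change the likelihood
tiny = [two_component_log_lik(y, 1e-9, (m, 0.0), (s, 1.0))
        for m, s in ((-5.0, 0.3), (0.0, 1.0), (4.0, 10.0))]

print(same)
print(tiny)
```

Within each case the printed log-likelihoods agree up to rounding (case 1) or a negligible amount (case 2), so the sampler has whole regions of parameter space that the data cannot tell apart. That kind of geometry is consistent with long trajectories and saturated tree depth.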

Does that make sense?