Truncated DP for p-dimensional Bernoulli mixtures

Hello everyone,

I am fairly new to Stan and currently play around with a mixture model using a truncated DP with a fixed number of components K=5. Every observation \mathbf{x} \in \{ 0, 1\}^p is modelled as

P(\mathbf{x} \mid \dots) = \sum_k^K \pi_k \prod_i^p \text{Bernoulli}(x_i \mid \mu_{k, i}),

where \boldsymbol \pi is a vector of mixing weights and \boldsymbol M a K \times p-dimensional matrix of success-probabilities for the Bernoullis. I am afraid the problem is fairly simple to detect but I cannot spot it. I guess, something with my model specification seems to be flawed, because from 10000 samples I get like 9900 divergent transitions or so. I ordered the matrix of success probabilities and use an inv_logit transform to keep them in bounds, but that didn’t help. Could anyone help please?

Below is the Stan code. The code in transformed parameters is the stick-breaking construction for the mixing weights.

Thank you,
Simon

data {
	int<lower=1> K;
	int<lower=1> n;
	int<lower=1> p;
	real<lower=0> a;
	real<lower=0> b;
	int<lower=0,upper=1> x[n, p];
	real<lower=1> alpha;
}

parameters {    	
  	ordered[p] rates[K];
	real<lower=0, upper=1> nu[K];	
}

transformed parameters {
  simplex[K] pi;
  vector<lower=0, upper=1>[p] prob[K];

  pi[1] = nu[1];
  for(j in 2:(K-1)) 
  {
      pi[j] = nu[j] * (1 - nu[j - 1]) * pi[j - 1] / nu[j - 1]; 
  }
  pi[K] = 1 - sum(pi[1:(K - 1)]);

  for (k in 1:K) 
  {
  	for (ps in 1:p) 
  	{
  		prob[k, ps] = inv_logit(rates[k, ps]);  
  	}
  }   
}

model {
  	real mix[K];
  	nu ~ beta(1, alpha);	
  
  	for(i in 1:n) 
  	{
		for(k in 1:K) 
		{
			mix[k] = log(pi[k]);
			for (ps in 1:p)
			{
				mix[k] += bernoulli_lpmf(x[i, ps] | prob[k, ps]);
			}
		}
		target += log_sum_exp(mix);
  	}
}