Any suggestions to speed up a mixture model? (Dirichlet process)

I am trying to fit a Dirichlet process Gaussian mixture model in Stan.
I know that a true Dirichlet process is not possible there, so I tried to mimic one with stick breaking, assuming a maximum number of clusters. I saw a similar approach in PyMC3 (link).
That said, the Dirichlet process itself is not the problem in this topic.
Do you have any suggestions to speed up a mixture model?
Below is the Stan code. It is simple. I tried to remove the double for loop but failed.
Thanks in advance!

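model {
  real alpha = 1;
  real a = 0.001;
  real b = 0.001;
  real ps[C];  // per-cluster log terms for one observation
  sigma_cl ~ inv_gamma(a, b);
  mu_cl ~ normal(0, 5);
  v ~ beta(1, alpha);
  for (i in 1:N) {
    for (c in 1:C) {
      // assumed: the loop body is cut off in the post; w (stick-breaking weights) and y (data) are placeholder names
      ps[c] = log(w[c]) + normal_lpdf(y[i] | mu_cl[c], sigma_cl[c]);
    }
    target += log_sum_exp(ps);  // marginalize over the cluster label
  }
}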

You can see the full analysis here
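For reference, here is a minimal sketch of how the whole program might look with the weights built by stick breaking and their logs computed once instead of inside the double loop. Everything outside the model block is an assumption, since the post only shows part of that block: the data names N, C and y, the univariate vector y, the weight vector w, and the helper variable remaining are all placeholders. The loop over observations stays, because normal_lpdf returns a sum when given vectors, so this is only a modest saving and not a guaranteed speed-up.

```stan
data {
  int<lower=1> N;                      // number of observations (assumed name)
  int<lower=1> C;                      // truncation level: maximum number of clusters
  vector[N] y;                         // observations (assumed univariate)
}
parameters {
  vector[C] mu_cl;                     // cluster locations
  vector<lower=0>[C] sigma_cl;         // cluster scales
  vector<lower=0, upper=1>[C - 1] v;   // stick-breaking fractions
}
transformed parameters {
  vector[C] w;                         // mixture weights (hypothetical name)
  {
    real remaining = 1;                // length of stick not yet assigned
    for (c in 1:(C - 1)) {
      w[c] = v[c] * remaining;
      remaining *= 1 - v[c];
    }
    w[C] = remaining;                  // last cluster takes the rest, so w sums to 1
  }
}
model {
  real alpha = 1;
  vector[C] log_w = log(w);            // computed once, not N * C times
  sigma_cl ~ inv_gamma(0.001, 0.001);
  mu_cl ~ normal(0, 5);
  v ~ beta(1, alpha);
  for (i in 1:N) {
    vector[C] lp = log_w;
    for (c in 1:C) {
      lp[c] += normal_lpdf(y[i] | mu_cl[c], sigma_cl[c]);
    }
    target += log_sum_exp(lp);         // marginalize over the cluster label
  }
}
```

The only change from the original loop is that log(w) is hoisted out, so the autodiff graph holds C log nodes per evaluation instead of N * C.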

Exchangeable mixture models, such as those arising from truncated Dirichlet process priors, induce non-identified, horrendously multimodal posterior distributions that cannot be accurately fit with any sampler, let alone Stan. This is particularly easy to see by running multiple chains from diffuse initial conditions and comparing the different modes each one discovers. This is also the reason why we cannot perform full Bayesian inference over Latent Dirichlet Allocation and related models.
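
To spell out the non-identifiability, here is the mixture likelihood written for a truncation at C clusters. The symbols are assumptions chosen for illustration: w_c are the mixture weights, mu_c and sigma_c the cluster locations and scales, and rho is any permutation of the cluster labels.

```latex
p(y \mid w, \mu, \sigma)
  = \prod_{i=1}^{N} \sum_{c=1}^{C} w_c \, \mathcal{N}(y_i \mid \mu_c, \sigma_c)
  = \prod_{i=1}^{N} \sum_{c=1}^{C} w_{\rho(c)} \, \mathcal{N}(y_i \mid \mu_{\rho(c)}, \sigma_{\rho(c)})
  \quad \text{for every permutation } \rho \text{ of } \{1, \dots, C\}.
```

Relabeling the clusters only reorders the terms of the inner sum, so the likelihood cannot tell the C! labelings apart. To the extent that the prior also treats the components symmetrically, each posterior mode comes with up to C! relabeled copies, and separate chains can settle into different ones.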