[Please include Stan program and accompanying data if possible]

```
##### stan file
stanmodelcode="data {
int<lower=1> N; // Number of points from 2 to track length-1 of the i'th for a cluster.
int<lower=1> K; // number of states
int<lower=1> J; // number of cells in the cluster.
vector[N] angular_change_cluster; // an array containning the angular change of all trajectories.
vector[N] radius_cluster; // an array containning the radius (second-to-the-end) of all trajectories.
vector[N+J] number_of_neighbor_cluster; // we will use it as a covariate to determine two transition matrices.
vector<lower=0>[K] alpha_prior; // dirichlet distribution parameters
vector[K] alpha_P_1[K]; // dirichlet distribution parameters for P_1
vector[K] alpha_P_2[K]; // dirichlet distribution parameters for P_2
int<lower=1> counter_n[J]; // where to start the markovian process
int<lower=1> b_final_f[J]; // where to end the markovian process
int<lower=1> a_initial_f[J]; // first position of each trajectory
}
parameters {
simplex[K] prior_pi; // k is the number of observed models we consider.
simplex[K] P[2, K]; // transition matrix
real<lower=-1.5, upper=1.5> alpha_non_per_radius;
real<lower=0, upper=3.5> beta_non_per_radius;
real<lower=-1.5, upper=1.5> alpha_per_radius;
real<lower=0, upper=3.5> beta_per_radius;
real<lower=-1.5, upper=1.5> alpha_hesi_radius;
real<lower=0, upper=3.5> beta_hesi_radius;
real<lower=0, upper=4> alpha_per_angle;
real<lower=0, upper=10> beta_per_angle;
real<lower=0, upper=10> alpha_non_per_angle;
real<lower=0, upper=4> beta_non_per_angle;
real<lower=0, upper=10> alpha_hesi_angle;
}
model {
matrix[K, K] P_transformed[K]; //saving P as a matrix format.
vector[N+J] prior_pi_alternative[K];
vector[N] prior_pi_modeling[K];
row_vector[N] models[K]; // coordinates
row_vector[K] a;
row_vector[K] b;
int t;
vector[K] transfer;
for(k in 1:K){
target+= dirichlet_lpdf(P[1, k]| alpha_P_1[k]);
}
for(k in 1:K){
target+= dirichlet_lpdf(P[2, k]| alpha_P_2[k]);
}
for(l in 1:2){
for(m in 1:K){
for(n in 1:K){
P_transformed[l, m, n] = P[l, m, n];
}
}
}
target+= dirichlet_lpdf(prior_pi| alpha_prior);
for(j in 1:J){
# target+= normal_lpdf(alpha_r[j]| b_0, sigma_alpha);
# target+= chi_square_lpdf(radius[a_initial_f[j]:b_final_f[j]]| alpha_r[j]);
for(k in 1:K){
prior_pi_alternative[k, a_initial_f[j]] = prior_pi[k];
}
for(l in counter_n[j]: b_final_f[j]){
for(k in 1:K){
b[k] = prior_pi_alternative[k, (l-1)];
}
if(number_of_neighbor_cluster[l-1]<=0){
a = b * P_transformed[1];
}
if(number_of_neighbor_cluster[l-1]>0){
a = b * P_transformed[2];
}
for(k in 1:K){
prior_pi_alternative[k, l] = a[k];
}
}
}
t = 1;
for(j in 1:J){
for(l in counter_n[j]: b_final_f[j]){
for(k in 1:K){
prior_pi_modeling[k, t] = prior_pi_alternative[k, l];
}
t = t+1;
}
}
## Introducing no priors means we consider flat prior for the parameter. Note that the domain of the flat prior comes from the bound we have already defined for parameters.
for(n in 1:N){
transfer[1] = log(prior_pi_modeling[1, n]) + beta_lpdf( (angular_change_cluster[n]/pi())| alpha_non_per_angle, beta_non_per_angle) + lognormal_lpdf(radius_cluster[n] | alpha_non_per_radius, beta_non_per_radius);
transfer[2] = log(prior_pi_modeling[2, n]) + beta_lpdf((angular_change_cluster[n]/pi())| alpha_hesi_angle, alpha_hesi_angle) + lognormal_lpdf(radius_cluster[n] | alpha_hesi_radius, beta_hesi_radius);
transfer[3] = log(prior_pi_modeling[3, n]) + beta_lpdf((angular_change_cluster[n]/pi())| alpha_per_angle, beta_per_angle) + lognormal_lpdf(radius_cluster[n] | alpha_per_radius, beta_per_radius);
target += log_sum_exp( transfer );
}
}"
```

Hello STAN group,

Thanks in advance for your support. I have some sets of data to run the above program with. I already use the following command to run the program

```
fit <- stan(model_code = stanmodelcode, model_name = "example",
data = dat.cluster.1, iter = 200000, chains = 8, verbose = TRUE, control = list(adapt_delta = 0.9999, stepsize = 0.001, max_treedepth =15), cores=8)
```

My problem is the program converges for some of data sets, but diverges for others where I get large rhat for them. My question is whether the increase of iterations can help those data sets converge? Please note that it takes 10 days to get the program run. It might be worth mentioning that I got no error while running the program. Also, I did not play with some parameters like adapt_delta, stepsize, max_treedepth.

Regards,

Elaheh

[edit: escaped program, but didn’t clean up spacing]