Multiple Chains do not converge

I am new to Stan (accessing through RStan) and am trying to fit a latent variable model to ordered data. This is the model I am using to fit it:

data {
  int<lower=0> n; // number of data points (4163)
  int<lower=0> j; // number of item parameters (5)
  int y1[n]; // binary item
  int y2[n]; // ordinal items
  int y3[n];
  int y4[n];
  int y5[n];
  int<lower=2> k2; // number of response categories for items 2 to 5
  int<lower=2> k3;
  int<lower=2> k4;
  int<lower=2> k5;
}

parameters {
  vector[j] alpha; // item difficulty parameters
  real<lower=0> beta[j]; // item discrimination parameters
  vector[n] theta; // latent variables
  ordered[k2-1] c2; // cutpoints for items 2 to 5
  ordered[k3-1] c3;
  ordered[k4-1] c4;
  ordered[k5-1] c5;
}

model {
  theta ~ normal(0, 1); // fixing latent variable scale
  alpha ~ normal(0, 10); // prior on item difficulty
  beta ~ gamma(4, 3); // prior on item discrimination
  y1 ~ bernoulli_logit(alpha[1] + beta[1] * theta);
  y2 ~ ordered_logistic(alpha[2] + beta[2] * theta, c2);
  y3 ~ ordered_logistic(alpha[3] + beta[3] * theta, c3);
  y4 ~ ordered_logistic(alpha[4] + beta[4] * theta, c4);
  y5 ~ ordered_logistic(alpha[5] + beta[5] * theta, c5);
}

When I run the model with one Markov chain, it appears to converge, but every iteration exceeds the maximum tree depth. When I rerun it with the maximum tree depth raised to 15, sampling never gets past 0% and my R session crashes. With two Markov chains and everything else unchanged, the model fits with no divergent transitions, but there is no sign of convergence: the Rhats are very high, the traceplots are fairly straight lines, and the two chains do not overlap. The only warning I get is again about maximum tree depth, but the brief warning description suggests that this issue concerns sampling efficiency rather than convergence. Is there something I can do to get the model to converge on multiple chains?

Justin M.

Hitting maximum treedepth means that the Hamiltonian integrator is having to integrate a long time before it hits a U-turn (and generates a new MCMC draw).

Two common things that can cause this are unidentifiabilities in the model or situations where the timestep has been driven very low to keep from having divergences.

Since multiple chains don’t appear to be doing the same thing, I’m guessing there might also be an unidentifiability here.
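One candidate worth checking, offered as an observation about the posted model rather than a confirmed diagnosis: for items 2 to 5 the likelihood sees the intercept and the cutpoints only through their difference. Writing item m's linear predictor as \eta_{i} = \alpha_m + \beta_m \theta_i and its cutpoints as c_{m,k}, the ordered logistic cumulative probability is

$$
\Pr(y_{i} \le k) = \operatorname{logit}^{-1}\left(c_{m,k} - \alpha_m - \beta_m \theta_i\right)
= \operatorname{logit}^{-1}\left((c_{m,k} + \delta) - (\alpha_m + \delta) - \beta_m \theta_i\right)
\quad \text{for any } \delta .
$$

Shifting alpha[m] and every element of the matching cutpoint vector by the same \delta therefore leaves the likelihood unchanged. In the model as posted, only the normal(0, 10) prior on alpha limits that shift, and the cutpoints have no prior at all, so different chains could settle at different offsets.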

These difficulties could be coming from your data, or they could be intrinsic to the model. The way to figure this out is to simulate data from your model and then fit the model to that simulated data. If this fails, you’ll need to change something in your model: use a different parameterization, tighten the priors, or similar.

Simulated data is probably the place to start.
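A minimal sketch of the simulation step, assuming nothing beyond the model as posted. It is written in plain Python only to keep it self-contained; the same few lines translate directly to R. The function name simulate_irt_data, the category counts in ks, and the sorted normal(0, cut_sd) draws for the cutpoints are all assumptions made for illustration (the posted model puts no prior on the cutpoints, so there is nothing to draw them from).

```python
import math
import random


def inv_logit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def simulate_irt_data(n=500, ks=(4, 4, 4, 4), alpha_sd=10.0, cut_sd=2.0, seed=1):
    """Draw parameters from the posted priors, then simulate y1..y5.

    ks holds hypothetical category counts k2..k5. The cutpoint draws
    (sorted normal(0, cut_sd)) are an assumption, not part of the post.
    """
    rng = random.Random(seed)
    j = 1 + len(ks)
    theta = [rng.gauss(0.0, 1.0) for _ in range(n)]
    alpha = [rng.gauss(0.0, alpha_sd) for _ in range(j)]
    # Stan's gamma(4, 3) is shape/rate; gammavariate takes shape/scale.
    beta = [rng.gammavariate(4.0, 1.0 / 3.0) for _ in range(j)]
    cuts = [sorted(rng.gauss(0.0, cut_sd) for _ in range(k - 1)) for k in ks]

    data = {"n": n, "j": j}
    # Item 1 is binary: bernoulli_logit(alpha[1] + beta[1] * theta).
    data["y1"] = [int(rng.random() < inv_logit(alpha[0] + beta[0] * t)) for t in theta]
    # Items 2..5 are ordinal: P(y > k) = inv_logit(eta - c[k]).
    for item, (k, c) in enumerate(zip(ks, cuts), start=2):
        ys = []
        for t in theta:
            eta = alpha[item - 1] + beta[item - 1] * t
            u = rng.random()
            # Count how many cutpoints the response falls above.
            ys.append(1 + sum(u < inv_logit(eta - ck) for ck in c))
        data[f"y{item}"] = ys
        data[f"k{item}"] = k

    truth = {"theta": theta, "alpha": alpha, "beta": beta, "cuts": cuts}
    return data, truth


stan_data, truth = simulate_irt_data()
```

The keys of stan_data match the data declarations in the posted model, so it can be passed to the sampler as is, and the fitted alpha, beta and cutpoints can then be compared with truth. If several chains still disagree on data simulated this way, the problem is in the model rather than in the real data. Note that alpha_sd=10.0 mirrors the posted prior and can produce items where nearly every response lands in one category; a smaller value gives more informative simulated data.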