I’m looking for advice on how to deal with divergent transitions in a latent variable model (basically a confirmatory factor analysis model using ordered logit). The response data are 15 survey items per person, each on a 5-point Likert-type ordered scale. Each item loads on exactly one of three correlated latent variables (eta). I’m following the design in Lee and Song (2012), “Basic and Advanced Bayesian Structural Equation Modeling with Application in the Medical and Behavioral Sciences”, where the latent variables are estimated explicitly. I’m pretty sure I need to take this approach (please tell me if I’m wrong) because I am trying to jointly estimate a multilevel multinomial logit and a latent variable model. I’ve attached a 10% subset of the data (for quick testing), the R code, and both Stan models. The LatentVariableOnly model is where the problem is; the joint model is just there to provide context for my approach.

A few comments about the constraints I use to identify the latent variable model:

(1) The loading (lambda) for the first item of each latent variable is fixed at 1, so it does not appear as a parameter.

(2) The cutpoints (kappa) for the ordered logit are fixed at both ends: the first cutpoint is based on the response frequency of the first category, and the last on the cumulative frequency of the first through fourth categories (as recommended by Lee and Song (2012)).

(3) The lambdas are constrained to be positive (otherwise their sign flips from negative to positive between chains). This is also theoretically motivated, as I’ve structured the responses such that no Likert item should have a negative loading.
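
To make constraint (2) concrete, here is a minimal sketch of the arithmetic in Python (used only for illustration; it is not taken from the attached R script). It assumes the fixed values are the logit of the observed cumulative proportions, and the counts are made up.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def fixed_end_cutpoints(counts):
    """Return (first, last) fixed cutpoints for one 5-category item.

    counts: observed frequencies of categories 1..5 (hypothetical data).
    Assumption: cutpoints are the logit of the cumulative proportions.
    """
    total = sum(counts)
    p_first = counts[0] / total              # share in category 1
    p_upto_fourth = sum(counts[:4]) / total  # cumulative share of categories 1-4
    return logit(p_first), logit(p_upto_fourth)

# Hypothetical counts for a single Likert item
counts = [10, 20, 40, 20, 10]
kappa_1, kappa_4 = fixed_end_cutpoints(counts)
```

With these two values held fixed, only the two interior cutpoints of each item remain free parameters.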

Now to my problem. Sampling looks pretty good in terms of Rhat and traceplots, but even with a non-centered parameterization I still get a substantial share of divergent transitions after warmup (5-10% of iterations). They all seem to be below the diagonal on the pairs plot, and raising adapt_delta does not resolve the issue (I’ve tried up to 0.999). I know the Stan guidance says this means I should reparameterize the model, but I don’t know where to turn. Any advice on how to figure out where the trouble is and/or how I might go about reparameterizing the model would be greatly appreciated. Also, is there any way to estimate how much these divergent transitions are biasing my estimates?
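
To be explicit about what I mean by non-centered parameterization for the correlated latent variables: the sampler works on independent standard-normal draws z, and eta is obtained as L z, where L is the Cholesky factor of the latent covariance. Below is a minimal Python sketch of that transform only. It is not the attached Stan code, and the covariance matrix `Sigma` is a made-up example.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L^T == a, for a symmetric positive-definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def non_centered_eta(L, z):
    """Map independent standard-normal draws z to correlated eta = L z."""
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(z))]

# Hypothetical covariance among the three latent variables
Sigma = [[1.0, 0.3, 0.2],
         [0.3, 1.0, 0.4],
         [0.2, 0.4, 1.0]]
L = cholesky(Sigma)

rng = random.Random(1)
z = [rng.gauss(0.0, 1.0) for _ in range(3)]  # the sampler would explore z
eta = non_centered_eta(L, z)                 # eta then has covariance Sigma
```

The point of the transform is that the sampled quantities (z) are a priori independent with unit scale, while the correlation structure enters only through L.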

LatentVariableOnly.stan (1.9 KB)

LatentVariable_MNL.stan (4.1 KB)

other_data.csv (3.5 KB)

LatentVariableModels.R (2.8 KB)

likert_data.csv (4.7 KB)