Divergent transitions in a latent variable model

I’m looking for advice on how to deal with divergent transitions in a latent variable model (basically a confirmatory factor analysis model with ordered-logit measurement equations). The response data are 15 survey items per person, each on a 5-point Likert-type ordered scale. Each item loads on one of three correlated latent variables (eta). I’m following the design in Lee and Song (2012), “Basic and Advanced Bayesian Structural Equation Modeling with Application in the Medical and Behavioral Sciences”, where the latent variables are estimated explicitly. I’m pretty sure I need to take this approach (please tell me if I’m wrong) because I am trying to jointly estimate a multilevel multinomial logit and a latent variable model. I’ve attached a 10% subset of the data (for quick testing), the R code, and both Stan models. The LatentVariableOnly model is where the problem is; the joint model is just there to provide context for my approach.

A few comments about constraints for identifiability of the latent variable model:
(1) The “loading” (lambda) for the first equation in each latent variable is implicitly 1 (hence not there)
(2) Cutpoints (kappa) for the ordered_logit are fixed at both ends based on the response frequency of the first and the cumulative first-fourth categories (as recommended by Lee and Song (2012)).
(3) lambdas are constrained to be positive (otherwise they jump from negative to positive between chains). This is also theoretically motivated as I’ve structured the responses such that no Likert item should have a negative loading.

As to my problem: sampling looks pretty good in terms of Rhat and traceplots, but even after switching to a non-centered parameterization I still get a significant number of divergent transitions after warmup (5-10% of iterations). They all seem to be below the diagonal on the pairs plot, and raising adapt_delta does not resolve the issue (I’ve tried up to 0.999). I know the Stan guidance says this means I should reparameterize my model, but I don’t know where to turn. Any advice on how to figure out where the trouble is and/or how I might go about reparameterizing the model would be greatly appreciated. Also, is there any way to estimate how much these divergent transitions are biasing my results?

LatentVariableOnly.stan (1.9 KB)
LatentVariable_MNL.stan (4.1 KB)
other_data.csv (3.5 KB)
LatentVariableModels.R (2.8 KB)
likert_data.csv (4.7 KB)

I had a look at this. I really don’t know about these models, but I did notice that sometimes kappa_free[j, 2] would end up greater than kappa_fixed[j, 2]. To debug that you can add statements like:

if (kappa_free[j, 2] > kappa_fixed[j, 2]) {
  // report the offending cutpoints whenever the ordering is violated
  print("Error", kappa[j], kappa_free[j], kappa_fixed[j]);
}

To the transformed parameters block.

If you replace:

kappa[j,2] = kappa_free[j,1];
kappa[j,3] = kappa_free[j,2];

with (the little 1e-12 nudges are necessary to make sure kappa[j] is ordered with strict inequalities)

kappa[j,2] = kappa_free[j,1] > kappa_fixed[j,1] ? kappa_free[j, 1] : kappa_fixed[j,1] + 1e-12;
kappa[j,3] = kappa_free[j,2] < kappa_fixed[j,2] ? kappa_free[j, 2] : kappa_fixed[j,2] - 1e-12;

and use the run command:

fit.LV <- stan(file = 'LatentVariableOnly.stan',
            data = data,
            init = list(list(kappa_free = init_kappa(kappa_fixed))),
            iter = 1000,
            chains = 1)

then instead of about 80 divergent transitions I get about 5. I seriously doubt what I wrote there is the correct way to handle this, but hopefully it gives you some ideas about where to look. I do not know why Stan isn’t throwing an error about this (I see checks are in place for it in ordered_logistic_lpmf). Hopefully someone else can comment on that, but play around with how you parameterize that piece of your model and see if you can get anywhere.
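To compare runs like this without eyeballing the warnings, something along these lines could work. It is only a sketch: `divergence_summary` is a name made up here, and it assumes the input is a list with one matrix per chain containing a `divergent__` column, which is what `rstan::get_sampler_params(fit, inc_warmup = FALSE)` returns. The fake chain is a stand-in so the snippet runs without a fitted model.

```r
# Summarise post-warmup divergences from rstan-style sampler parameters.
# `sampler_params` is assumed to be a list with one matrix per chain, each
# with a "divergent__" column (0/1 per iteration).
divergence_summary <- function(sampler_params) {
  per_chain <- vapply(sampler_params,
                      function(m) sum(m[, "divergent__"]),
                      numeric(1))
  n_iter <- sum(vapply(sampler_params, nrow, integer(1)))
  list(per_chain = per_chain,
       total = sum(per_chain),
       rate = sum(per_chain) / n_iter)
}

# Stand-in for one chain of 500 post-warmup draws, 5 of them divergent
fake_chain <- cbind(divergent__ = c(rep(1, 5), rep(0, 495)))
result <- divergence_summary(list(fake_chain))
print(result$total)
print(result$rate)
```

With a real fit the call would be `divergence_summary(rstan::get_sampler_params(fit.LV, inc_warmup = FALSE))`.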

Hope that helps!

Thanks very much @bbbales2 . I had no idea Stan wasn’t throwing errors for my ordered constraint. Do you think it is a bug that should be reported? Anyway, I resolved the issue by reparameterizing using a simplex based on this discussion. No more divergent transitions; code now looks like this:

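parameters {
  // ... other parameters omitted ...
  // three gaps per item: the first two cumulative sums fall strictly inside (0, 1)
  simplex[3] theta[P];
}
transformed parameters {
  ordered[2] kappa_free[P];
  ordered[4] kappa[P]; // cutpoints for ordinal class (5 - 1 classes)

  for (j in 1:P) {
    // place the two free cutpoints strictly between the two fixed ones
    kappa_free[j] = kappa_fixed[j,1] + head(cumulative_sum(theta[j]), 2)
                      * (kappa_fixed[j,2] - kappa_fixed[j,1]);
    kappa[j,1] = kappa_fixed[j,1];
    kappa[j,2] = kappa_free[j,1];
    kappa[j,3] = kappa_free[j,2];
    kappa[j,4] = kappa_fixed[j,2];
  }
}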
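For anyone who wants to check the arithmetic of this construction outside Stan, here is a small R sketch of the same mapping. The function name `cutpoints_from_simplex` and the numbers are made up for illustration, and it assumes a 3-element simplex so that the first two cumulative sums lie strictly between 0 and 1.

```r
# Map a 3-element simplex onto four ordered cutpoints with fixed endpoints.
# The first two cumulative sums of theta are strictly inside (0, 1), so the
# two interior cutpoints land strictly between kappa_lo and kappa_hi.
cutpoints_from_simplex <- function(theta, kappa_lo, kappa_hi) {
  stopifnot(length(theta) == 3, all(theta > 0), abs(sum(theta) - 1) < 1e-8)
  interior <- kappa_lo + head(cumsum(theta), 2) * (kappa_hi - kappa_lo)
  c(kappa_lo, interior, kappa_hi)
}

# Illustrative values only, not taken from the attached data
kappa <- cutpoints_from_simplex(c(0.2, 0.5, 0.3), kappa_lo = -1.5, kappa_hi = 2.0)
print(kappa)
print(all(diff(kappa) > 0))
```

Whatever positive simplex is passed in, the result is strictly increasing, which is why no states need to be rejected.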

I dunno for sure, but it sure seems strange.

Make an issue over here: https://github.com/stan-dev/stan/issues/new then someone can have a look at it and decide whether or not it’s a problem.


I’m not sure what you were expecting, or for which Stan program. If you have a simple reproducible example, it’d be great if you could include it here or follow @bbbales2’s suggestion and file an issue.

Whatever the case, you don’t want to use constraints in transformed parameters to reject states—you need to build the constraints on the parameters so that any values satisfying the constraints will have finite log density values.
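As a rough illustration of what building the constraint into the parameter means: for a scalar declared with lower and upper bounds, Stan samples an unconstrained value and maps it into the interval with a scaled inverse logit. The R sketch below mimics that mapping. `to_interval` and the bounds are made up for the example, and it is not meant as a reproduction of Stan's internals.

```r
# Map an unconstrained real onto the open interval (lo, hi) with a scaled
# inverse logit. Every finite input of moderate size gives a valid value,
# so the sampler never has to reject a state for violating the bounds.
to_interval <- function(u, lo, hi) {
  lo + (hi - lo) * plogis(u)
}

u <- c(-5, -1, 0, 1, 5)               # unconstrained values
x <- to_interval(u, lo = -1.5, hi = 2.0)
print(x)
print(all(x > -1.5 & x < 2.0))
```

The simplex and ordered types work the same way in spirit: the sampler moves in an unconstrained space and the transform guarantees the constraint.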

I followed @bbbales2’s guidance and filed an issue, which @Bob_Carpenter promptly closed because it had already been fixed in the develop branch. My apologies for not reporting back here.


No worries—we’d rather get duplicate issues than miss something important. We appreciate all the feedback we can get.
