Divergent transitions with adapt_delta > .9999 using stan_lmer


Operating System: Windows 10
Interface Version: 2.15.3
Compiler/Toolkit: I am not sure.

Hi there,
I am new to stan_lmer and I am trying to fit the model below to data from a self-paced reading experiment.
The variable condition.c has three levels, so I am using a sliding contrast to compare level 1 to level 2 and level 2 to level 3; code.c is the sentence region and has two levels. I want to know how individual differences modulate the differences between those conditions and their interaction.

m1 <- stan_lmer(formula = log(reading_time) ~
                  (condition.c + code.c + condition.c : code.c)*
                  (1 + condition.c + code.c + condition.c : code.c| participant) +
                  (1 + condition.c + code.c + condition.c : code.c| item),
                prior_intercept = normal(0, 10),
                prior = normal(0, 1),
                prior_covariance = decov(regularization = 2),
                data = data,
                chains = 4,
                iter = 2000,
                cores = 4,
                adapt_delta = .9999)

After running the model with the default adapt_delta, I got a message suggesting that I increase adapt_delta. I did, but I still get the warning and at least 2 divergent transitions after warmup.
I looked for alternatives and found a suggestion to reparameterize the model. However, that approach seems applicable to the stan() function and not to the stan_lmer() or stan_glmer() functions.
I would like to know whether there is another way to solve this issue and avoid divergent transitions using stan_lmer.

Thank you.


Specify QR = TRUE.


Thanks a lot for your quick reply. I specified QR = TRUE and kept adapt_delta at .999, and I got a similar message. However, when I increased adapt_delta to .999999, the model no longer produced divergent transitions. This was true for both of the experiments I am analyzing with this approach.

The problem seems to be solved, but I would really like to know whether scaling up adapt_delta could be problematic, and I would also like to understand what the QR option does.

Thanks again for your super valuable help.



A higher adapt_delta is not problematic in itself, but it makes the sampler take many small steps, so sampling is slower. Make sure you did not get any warnings telling you to increase max_treedepth. The QR option is documented in ?stan_lmer; in short, it applies an orthogonal rotation to the design matrix of the common predictors, which makes it more plausible that the rotated coefficients are uncorrelated, and then it rotates the coefficients back to the original units at the end.
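
To make the rotation idea concrete, here is a small base-R sketch of a thin QR decomposition applied to two correlated predictors. It is only an illustration under stated assumptions: the simulated data, the variable names (x1, x2, y) and the sqrt(n - 1) rescaling are choices made for this example, ordinary least squares stands in for the Bayesian fit, and the internals of stan_lmer may differ in detail.

```r
# Illustrative sketch only: simulated data, not the reading-time data above.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + sqrt(1 - 0.9^2) * rnorm(n)  # strongly correlated with x1
X  <- cbind(x1 = x1, x2 = x2)
X  <- sweep(X, 2, colMeans(X))               # center each column
y  <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(n, sd = 0.2)

# Thin QR decomposition: X = Q %*% R, Q has orthogonal columns,
# R is upper triangular. The sqrt(n - 1) rescaling is an assumption
# made here so that the columns of Q have unit variance.
decomp <- qr(X)
Q <- qr.Q(decomp) * sqrt(n - 1)
R <- qr.R(decomp) / sqrt(n - 1)

# The rotated predictors are uncorrelated, unlike the original ones.
cor_X <- cor(X)[1, 2]
cor_Q <- cor(Q)[1, 2]

# Fit on the rotated predictors, then rotate the coefficients back.
# Since X %*% beta = Q %*% (R %*% beta), we have theta = R %*% beta.
theta           <- coef(lm(y ~ Q))[-1]       # coefficients on the Q scale
beta_from_theta <- backsolve(R, theta)       # solves R %*% beta = theta
beta_direct     <- coef(lm(y ~ X))[-1]       # coefficients on the X scale

print(c(cor_X = cor_X, cor_Q = cor_Q))
print(rbind(beta_from_theta = beta_from_theta,
            beta_direct     = unname(beta_direct)))
```

The two sets of coefficients agree, which is the sense in which the rotation changes the geometry the sampler works in without changing the model: estimation happens on the uncorrelated scale, and the results are reported in the original units.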


Thanks for the clarifications. I did not get any warnings.
I hope this post helps others using the stan_lmer function, as well.
Thanks again.