Convergence failure (maybe) in brms


#1

Hi everyone

I’m getting a convergence failure message:

Warning message:
The model has not converged (some Rhats are > 1.1). Do not analyse the results!
We recommend running more iterations and/or setting stronger priors.

But when I look at the Rhats, all are 1.0 and the effective sample size also looks fine. Can anyone help?

Thanks
Ben

fit4a <- brm(formula = DV ~ (1 + Stype*PCA1 + Stype*Total_Active_Freq + Stype*Total_Passive_Freq | Name) + (1 + Stype | verb) + Stype*PCA1 + Stype*Total_Active_Freq + Stype*Total_Passive_Freq, data = Kids,
family = bernoulli(link = "logit"),
set_prior("normal(0,0.72)", class = "b"),
warmup = 2000, iter = 10000, chains = 1, cores = 4, save_all_pars = TRUE, control = list(adapt_delta = 0.99)) # All sentences

fit4a
Family: bernoulli
Links: mu = logit
Formula: DV ~ (1 + Stype * PCA1 + Stype * Total_Active_Freq + Stype * Total_Passive_Freq | Name) + (1 + Stype | verb) + Stype * PCA1 + Stype * Total_Active_Freq + Stype * Total_Passive_Freq
Data: Kids (Number of observations: 2160)
Samples: 1 chains, each with iter = 10000; warmup = 2000; thin = 1;
total post-warmup samples = 8000
ICs: LOO = NA; WAIC = NA; R2 = NA

Group-Level Effects:
~Name (Number of levels: 60)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 0.54 0.14 0.27 0.81 2963 1.00
sd(StypePASS) 0.95 0.17 0.63 1.30 2126 1.00
sd(PCA1) 0.23 0.10 0.03 0.44 2939 1.00
sd(Total_Active_Freq) 0.14 0.10 0.01 0.38 3196 1.00
sd(Total_Passive_Freq) 0.32 0.22 0.01 0.82 2720 1.00
sd(StypePASS:PCA1) 0.22 0.14 0.01 0.52 2597 1.00
sd(StypePASS:Total_Active_Freq) 0.12 0.10 0.01 0.37 4058 1.00
sd(StypePASS:Total_Passive_Freq) 0.25 0.20 0.01 0.74 4243 1.00
cor(Intercept,StypePASS) -0.43 0.20 -0.74 0.03 2012 1.00
cor(Intercept,PCA1) 0.31 0.28 -0.32 0.76 8000 1.00
cor(StypePASS,PCA1) 0.10 0.28 -0.46 0.61 8000 1.00
cor(Intercept,Total_Active_Freq) -0.05 0.32 -0.64 0.58 8000 1.00
cor(StypePASS,Total_Active_Freq) 0.07 0.30 -0.54 0.63 8000 1.00
cor(PCA1,Total_Active_Freq) 0.16 0.33 -0.52 0.73 8000 1.00
cor(Intercept,Total_Passive_Freq) 0.00 0.31 -0.59 0.62 8000 1.00
cor(StypePASS,Total_Passive_Freq) 0.10 0.31 -0.53 0.65 8000 1.00
cor(PCA1,Total_Passive_Freq) 0.20 0.33 -0.49 0.75 8000 1.00
cor(Total_Active_Freq,Total_Passive_Freq) -0.08 0.34 -0.70 0.58 8000 1.00
cor(Intercept,StypePASS:PCA1) 0.04 0.31 -0.57 0.63 8000 1.00
cor(StypePASS,StypePASS:PCA1) 0.12 0.30 -0.48 0.67 8000 1.00
cor(PCA1,StypePASS:PCA1) -0.08 0.33 -0.68 0.58 8000 1.00
cor(Total_Active_Freq,StypePASS:PCA1) 0.01 0.33 -0.61 0.63 6503 1.00
cor(Total_Passive_Freq,StypePASS:PCA1) 0.03 0.33 -0.60 0.65 6116 1.00
cor(Intercept,StypePASS:Total_Active_Freq) -0.02 0.32 -0.62 0.58 8000 1.00
cor(StypePASS,StypePASS:Total_Active_Freq) -0.07 0.32 -0.66 0.56 8000 1.00
cor(PCA1,StypePASS:Total_Active_Freq) -0.07 0.33 -0.67 0.58 8000 1.00
cor(Total_Active_Freq,StypePASS:Total_Active_Freq) -0.11 0.34 -0.71 0.56 8000 1.00
cor(Total_Passive_Freq,StypePASS:Total_Active_Freq) -0.11 0.34 -0.72 0.57 8000 1.00
cor(StypePASS:PCA1,StypePASS:Total_Active_Freq) 0.01 0.33 -0.63 0.64 8000 1.00
cor(Intercept,StypePASS:Total_Passive_Freq) 0.01 0.33 -0.62 0.63 8000 1.00
cor(StypePASS,StypePASS:Total_Passive_Freq) -0.04 0.33 -0.64 0.59 8000 1.00
cor(PCA1,StypePASS:Total_Passive_Freq) -0.04 0.33 -0.66 0.61 8000 1.00
cor(Total_Active_Freq,StypePASS:Total_Passive_Freq) -0.09 0.34 -0.71 0.59 8000 1.00
cor(Total_Passive_Freq,StypePASS:Total_Passive_Freq) -0.10 0.35 -0.72 0.58 8000 1.00
cor(StypePASS:PCA1,StypePASS:Total_Passive_Freq) 0.02 0.33 -0.62 0.65 8000 1.00
cor(StypePASS:Total_Active_Freq,StypePASS:Total_Passive_Freq) -0.08 0.35 -0.69 0.60 5588 1.00

~verb (Number of levels: 72)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 0.59 0.13 0.35 0.85 3550 1.00
sd(StypePASS) 0.84 0.17 0.53 1.18 3241 1.00
cor(Intercept,StypePASS) -0.95 0.06 -1.00 -0.81 3638 1.00

Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept 1.41 0.14 1.15 1.70 8000 1.00
StypePASS -1.37 0.20 -1.76 -0.98 8000 1.00
PCA1 0.32 0.11 0.09 0.54 8000 1.00
Total_Active_Freq 0.02 0.24 -0.44 0.50 8000 1.00
Total_Passive_Freq 0.16 0.51 -0.85 1.16 8000 1.00
StypePASS:PCA1 -0.05 0.15 -0.35 0.24 8000 1.00
StypePASS:Total_Active_Freq -0.05 0.28 -0.58 0.49 8000 1.00
StypePASS:Total_Passive_Freq -0.06 0.58 -1.20 1.08 8000 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).


#2

It is probably the Rhat of lp__.


#3

Thanks! So what’s the solution - just running it for longer?

Thanks
Ben


#4

I would run more chains before running the existing chains longer, but I don’t pay much attention to Rhat anyways, especially for lp__.


#5

Yes, run more chains. With 1 chain Rhat doesn’t work well. Run at least 4 chains.

You should pay attention to Rhat, including for lp__.
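
To illustrate why a single chain gives a weak check, here is a minimal base R sketch. `rhat_sketch` is a hypothetical helper written for this post, a simplified between/within variance ratio and not the exact split-chain computation that Stan or brms reports. The simulated draws are made up: three chains centred on 0 and one stuck around 3.

```r
# Illustrative Rhat: compares between-chain and within-chain variance.
# draws: numeric matrix, one row per iteration, one column per chain.
# (Hypothetical helper, not the brms/Stan implementation.)
rhat_sketch <- function(draws) {
  n <- nrow(draws)
  W <- mean(apply(draws, 2, var))   # average within-chain variance
  B <- n * var(colMeans(draws))     # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

set.seed(123)
n <- 1000
# Three chains exploring around 0 and one chain stuck around 3.
chains <- cbind(rnorm(n), rnorm(n), rnorm(n), rnorm(n, mean = 3))

# The stuck chain on its own, split into two halves: nothing looks wrong.
stuck <- chains[, 4]
rhat_one_chain <- rhat_sketch(cbind(stuck[1:(n / 2)], stuck[(n / 2 + 1):n]))

# All four chains together: the disagreement becomes visible.
rhat_four_chains <- rhat_sketch(chains)

print(c(one_chain = rhat_one_chain, four_chains = rhat_four_chains))
```

Taken alone, the stuck chain agrees with itself, so its value sits near 1. Only the comparison across chains exposes the problem, which is the reason for running several.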


#6

You may want to look at rhat(fit4a) to see which parameters have high Rhat.
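
As a rough sketch of how to filter that output, assuming `rhat(fit4a)` returns a named numeric vector: the vector below uses made-up names and values as a stand-in so the example runs without the fitted model, and `high_rhat` is a hypothetical helper.

```r
# Hypothetical values standing in for the named vector from rhat(fit4a);
# the parameter names and numbers are invented for illustration.
rhats <- c(b_Intercept = 1.000, sd_Name__Intercept = 1.001,
           lp__ = 1.120, `L_1[1,2]` = NaN)

# Keep only entries above the 1.1 threshold quoted in the warning,
# dropping NaN/NA entries so they do not clutter the result.
high_rhat <- function(r, threshold = 1.1) {
  r[!is.na(r) & r > threshold]
}

print(high_rhat(rhats))

# Parameters whose Rhat could not be computed, listed separately.
print(names(rhats)[is.nan(rhats)])
```

With the real fit, replace the stand-in vector with `rhat(fit4a)`.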


#7

To bring Ben and Aki’s comments into context – convergence diagnostics like \hat{R} consider the marginal behavior of your chains, and that can be very different for different variables. Some variables converge quickly, and their expectation values can be estimated quickly, while some converge more slowly and require longer running times.

If \hat{R} is close enough to one for all of your variables except for lp__ then the expectation value estimates for those variables are probably okay, especially if none of the other diagnostics are indicating problems.
But it doesn’t mean that the estimates for expectations of any function of those variables will be okay! Variables can correlate with each other in ways that make the convergence of a function worse and hence the estimate untrustworthy.

lp__ tends to be extremely sensitive to the autocorrelation of the Markov chain and hence provides a reasonable bound on how well any function of the variables will converge. In other words, ensuring that the diagnostics for lp__ are good gives you the strongest evidence that your fit is okay. If you focus only on a few variables and carefully check the diagnostics for those variables, then you may be able to ignore lp__ for that very specific context.


#8

Thanks so much everyone! OK so I reran it with 4 chains, 10,000 iterations and adapt_delta = 0.99, but I still get the same warning. Looking at the Rhats now, none are above 1.001 (many are 0.999), but the following are all NaN, which presumably suggests some problem with the model. Does anyone have any idea what that might be?

Thanks
Ben

L_1[1,1] L_1[1,2]
L_1[1,3] L_1[1,4]
L_1[1,5] L_1[1,6]
L_1[1,7] L_1[1,8]
L_1[2,3] L_1[2,4]
L_1[2,5] L_1[2,6]
L_1[2,7] L_1[2,8]
L_1[3,4] L_1[3,5]
L_1[3,6] L_1[3,7]
L_1[3,8] L_1[4,5]
L_1[4,6] L_1[4,7]
L_1[4,8] L_1[5,6]
L_1[5,7] L_1[5,8]
L_1[6,7] L_1[6,8]
L_1[7,8] L_2[1,1]
L_2[1,2]


#9

The upper triangular components of a Cholesky factor are constant. This registers as having zero empirical variance which causes the \hat{R} calculation to explode into a NaN. You can safely ignore those NaNs.
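
The arithmetic can be reproduced in a few lines of base R. This is a simplified between/within variance ratio, not the exact computation brms reports, and the constant value 0 and the 4 x 1000 layout are chosen purely for illustration.

```r
# A constant "parameter": 4 chains of 1000 identical draws, standing in for
# a fixed element of a Cholesky factor (the value 0 is for illustration).
draws <- matrix(0, nrow = 1000, ncol = 4)
n <- nrow(draws)

W <- mean(apply(draws, 2, var))   # within-chain variance: exactly 0
B <- n * var(colMeans(draws))     # between-chain variance: exactly 0

# The ratio is 0 / 0, which R evaluates to NaN.
rhat_constant <- sqrt(((n - 1) / n * W + B / n) / W)
print(is.nan(rhat_constant))
```

The NaN is a division by zero on a quantity that never moves, not a sign that sampling went wrong.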