Largest R-hat is NA: what does this mean, and how do I get around it?

This is my model specification:
model009 = brm(
  pairtotal ~ Condition + Trials_new + (Trials_new | Subnum),
  family = gaussian(),
  data = bloddy1,
  save_all_pars = TRUE,
  sample_prior = TRUE,
  chains = 6,
  iter = 10000,
  seed = 009
)
I get the following warnings. I have played around with a number of different iteration counts.
My data is a table of 4000 × 30.

The largest R-hat is NA, indicating chains have not mixed. Running the chains for more iterations may help. See https://mc-stan.org/misc/warnings.html#r-hat
Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable. Running the chains for more iterations may help. See https://mc-stan.org/misc/warnings.html#bulk-ess
Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable. Running the chains for more iterations may help. See https://mc-stan.org/misc/warnings.html#tail-ess

What can I do?


  1. Why 6 chains?
  2. Why not use intercepts, i.e., pairtotal ~ 1 + Condition + Trials_new + (1 + Trials_new | Subnum)?
  3. What does the output of rhat(model009) look like?
  4. What’s pairtotal, i.e., can you plot the density and explain a bit what you want to model?
  5. Are you using default priors? What does get_prior(pairtotal ~ 1 + Condition + Trials_new + (1 + Trials_new | Subnum), data = bloddy1, family = gaussian) look like?

Does brms not add intercepts automatically like lm/lmer?

@mike-lawrence, I’m not 100% sure, but ?brmsformula says,

By default, the population-level intercept (if incorporated) is estimated separately and not as part of population-level parameter vector b.

To me this sounds like it’s not incorporated automatically? :)


Thank you for responding:)

  1. Six, because when I dug around this forum for a solution to this R-hat NA issue, I came across that number.
  2. Because, as @mike-lawrence mentioned, I thought explicitly writing 1 + … is not required in brms, just as in lme4.

I am modeling the score received for a task over days, in two conditions, and looking at the change in these two conditions over the days.

  3. rhat(model009) gives values between 0.99 and 1 (except for one parameter, which is NaN).

There is an intercept added automatically, but by default it is not part of the vector b, and so e.g. it has its own prior specification. Something to do with how brms does mean centering to improve performance. y ~ x is the same as y ~ 1 + x, and you can turn off the special handling of the intercept with y ~ 0 + Intercept + x.
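To make that concrete, here is a minimal sketch (formula objects only, using brms::bf() with hypothetical variables y and x):

```r
library(brms)

# These two formulas are identical; brms adds the intercept implicitly:
bf(y ~ x)
bf(y ~ 1 + x)

# This turns off the special (centered) handling of the intercept, so it
# becomes an ordinary population-level coefficient named "Intercept":
bf(y ~ 0 + Intercept + x)
```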


Could you explain a bit more about the prior specification part? Is the vector b the Intercept class? Or could you explain that as well… @Ax3man

The Intercept is its own class, separate from the other population-level effects, which are all in class b. You can see that in the image you posted.

This was just to clarify that the formula torkar wrote in his point 2 is indeed identical to your formula.

But perhaps it’s worth pointing you to the section “Parameterization of the population-level intercept” of ?brmsformula, which reads:

… be aware that you are effectively defining a prior on the intercept of the centered design matrix not on the real intercept. You can turn off this special handling of the intercept by setting argument center to FALSE. For more details on setting priors on population-level intercepts, see set_prior.

As well as the referenced ?set_prior and the second paragraph of section 1.

I’m not sure if that means your prior on the Intercept with mean 2 may be causing issues.
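For what it's worth, a sketch of the two ways that prior can be specified (assuming your mean-2 prior is a normal(2, 1); adjust to whatever you actually used):

```r
library(brms)

# Default parameterization: this prior lands on the intercept of the
# *centered* design matrix, not on the real intercept:
prior(normal(2, 1), class = "Intercept")

# Alternative: with the formula
#   pairtotal ~ 0 + Intercept + Condition + Trials_new + (Trials_new | Subnum)
# the intercept is an ordinary "b" coefficient, so the prior applies to
# the real (uncentered) intercept:
prior(normal(2, 1), class = "b", coef = "Intercept")
```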

What does your outcome look like? Integers?

@torkar yes

Is it possible these warnings can be ignored? See here: Warning messages with `lkj_corr_cholesky` priors. It may be worth checking exactly which parameters these NA values are occurring for.

You can also check the fitted model with check_hmc_diagnostics(name_of_your_brms_model$fit). This requires rstan to have been loaded.
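For example, using the model name from this thread (rstan must be attached first):

```r
library(rstan)

# check_hmc_diagnostics() reports divergences, max treedepth hits, and
# E-BFMI on the underlying stanfit object:
check_hmc_diagnostics(model009$fit)
```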


One of the random slopes has an NA value. So, should I just ignore that and go ahead with posterior checks?

@Ax3man thank you :) That clears a lot of things.

@Ax3man
The trials are centered. Could you explain the difference between y ~ 0 + Intercept + x and y ~ x in this particular context? Is the interpretation of the intercept different if I use a centered predictor in the first place?
What difference does it make to the intercept, or rather, how does its definition change when we have a centered predictor vs. a non-centered one?
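As a toy illustration of this question (plain lm() with made-up data, not the brms model from this thread): the intercept is the expected outcome where the predictor equals 0, and centering moves that point to the predictor's mean:

```r
set.seed(1)
x <- 1:10
y <- 2 + 3 * x + rnorm(10, sd = 0.1)

coef(lm(y ~ x))["(Intercept)"]              # expected y at x = 0
coef(lm(y ~ I(x - mean(x))))["(Intercept)"] # expected y at x = mean(x)
```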

Looks like there are no problems with the model and those warnings should be ignored!


I have seen issues before when there is a transformed parameter defined equal to a constant.
