Rhat is 1.01 for one parameter out of 150. Is this bad?

Windows 10, brms version 2.15.0

Fitting a categorical model here, with 4 outcome categories, 29 population-level predictors (of which several have multiple df), 2 group-level effects (varying intercepts), with a total of 150 parameters to estimate.

Brms outputs these annoying little warnings:

Specifying global priors for regression coefficients in categorical models is deprecated. Please specify priors separately for each response category.

From an earlier exchange with @paul.buerkner I assume that this is harmless. The other warning is:

The global prior 'normal(0, 2.5)' will not be used in the model as all related coefficients have individual priors already. If you did not set those priors yourself, then maybe brms has assigned default priors. See ?set_prior and ?get_prior for more details.

I assume that this is also harmless. I’ve pre-defined my priors by first creating a table using mypriors <- get_prior(modelformula, family = categorical, data = mydata) and then setting common ‘normal(0, 2.5)’ priors for everything in the classes ‘b’ and ‘Intercept’ as well as setting common ‘exponential(2)’ priors for everything in the ‘sd’ class. Only for two coefficients in the ‘b’ class did I set individual priors.

Brms is brilliant software, but if both of these warnings are harmless then it would be nicer not to get them — a serious warning (such as one about divergent transitions) could easily be missed due to the flood of irrelevant ones.

With the whining thus over with, here’s my question: to my delight there was no notification of divergent transitions. However, one (1) of my 150 parameters has an Rhat of 1.01 (all the rest have 1.00). Are the results therefore unreliable? If they are, how might I avoid the problem upon refitting? Would increasing adapt_delta help? Its present value is 0.95.

The model consists of 2500*4 post-warmup samples for a total of 10 000, and I’m not keen to increase this because even with just 15 000, the model objects (and loo posterior objects) become too large for my laptop to handle without running out of memory.


I think there’s some variation of opinion here about small R-hats greater than one, but I don’t think that anybody would fault you for proceeding with your inference with a single R-hat of 1.01.

With that said, it’s worth remembering that convergence diagnostics never give you a strong guarantee of convergence—they just provide imperfect tests to flag clear cases of non-convergence. If millions of lives are hanging in the balance here, you might consider running more iterations to assess convergence, and then thinning prior to downstream analysis. Even more importantly, you might consider running more chains to improve your chances of detecting multimodality. Finally, you might consider performing simulation-based calibration (SBC) to build further confidence that your model provides accurate inference.
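The "run longer, then thin" suggestion can be sketched roughly as follows. This is a minimal illustration in plain Python, not brms code: the chains here are placeholder lists of integers standing in for real draws, and the chain length (5000) and thinning interval (2) are assumptions chosen so the kept total matches the 10 000 draws mentioned above. The idea is that convergence is assessed on the longer run, and only the thinned draws are carried into memory-hungry downstream steps.

```python
def thin_chain(draws, keep_every):
    """Keep every `keep_every`-th draw, starting from the first one."""
    if keep_every < 1:
        raise ValueError("keep_every must be a positive integer")
    return draws[::keep_every]


# Hypothetical setup: 4 chains with 5000 post-warmup draws each.
# The integers are placeholders for real posterior draws.
chains = [list(range(5000)) for _ in range(4)]

# Thin each chain separately so the chain structure is preserved.
thinned = [thin_chain(chain, keep_every=2) for chain in chains]
total_kept = sum(len(chain) for chain in thinned)
print(total_kept)
```

In brms itself, `brm()` has a `thin` argument that applies thinning at fitting time; whether to thin there or afterwards is a judgment call that depends on where the memory pressure arises.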

Increasing adapt_delta won’t help here (technically it should increase the ESS per iteration, but the effect would be marginal).

As an aside, elsewhere in the Bayesian world you’ll routinely find people claiming “convergence” with R-hat much higher than 1.01, even though they’re using a substantially less sensitive version of R-hat and algorithms that don’t provide advanced diagnostics like divergences. This state of affairs certainly doesn’t excuse Stan users from thinking carefully about the possibility of non-convergence in models with R-hats in excess of 1.00, but I do think it’s useful context. I recently reviewed a paper that claimed to assess convergence based on all R-hats being less than 1.2!


This seems like one of those issues that shouldn’t even make it past an editor’s desk.

There’s some disagreement about how close to 1 $\hat{R}$ needs to be, but fairly general agreement that anything in excess of 1.1 warrants further inspection. TBH, as long as the highest value is less than 1.05 and the vast majority of the parameters are below that threshold, there’s probably nothing to worry about, assuming other diagnostics don’t indicate further issues and posterior predictive checks look good.
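As a rough illustration of that rule of thumb, here is a small Python sketch that sorts parameters by the two thresholds mentioned in this post (1.05 and 1.1). The function name `triage_rhats` and the parameter names are hypothetical; the values mirror the original question (149 parameters at 1.00 and one at 1.01). The cutoffs are informal conventions, not hard rules.

```python
def triage_rhats(rhats, thresholds=(1.05, 1.1)):
    """Return, for each threshold, the names of parameters whose R-hat exceeds it."""
    return {
        t: [name for name, value in rhats.items() if value > t]
        for t in thresholds
    }


# Hypothetical parameter names; values follow the original question.
rhats = {f"param_{i}": 1.00 for i in range(149)}
rhats["param_149"] = 1.01

flagged = triage_rhats(rhats)
worst = max(rhats, key=rhats.get)
print(worst, rhats[worst], flagged)
```

With these inputs nothing crosses either threshold, which matches the advice that a lone 1.01 is not by itself a reason to distrust the fit, provided the other diagnostics are clean.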


There is some inherent stochasticity in R-hat, which mostly depends on the number of iterations you run.
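To get a feel for that stochasticity, here is a minimal simulation sketch in Python (standard library only). It rests on several assumptions: the draws are independent standard normals, so every "chain" has converged by construction; the statistic is the classic split R-hat, a simplified stand-in for the rank-normalized version that Stan and brms report; and the function names, seed, and iteration counts are made up for illustration.

```python
import math
import random


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def split_rhat(chains):
    """Classic split R-hat for equal-length chains (no rank normalization)."""
    half = len(chains[0]) // 2
    # Split each chain in two so that within-chain drift would also show up.
    halves = [c[:half] for c in chains] + [c[half:2 * half] for c in chains]
    means = [sum(h) / half for h in halves]
    within = sum(sample_variance(h) for h in halves) / len(halves)  # W
    between = half * sample_variance(means)  # B
    var_plus = (half - 1) / half * within + between / half
    return math.sqrt(var_plus / within)


def simulate_rhats(n_params, n_chains, n_iter, seed):
    """R-hat for n_params independent parameters, all perfectly 'converged'."""
    rng = random.Random(seed)
    rhats = []
    for _ in range(n_params):
        chains = [[rng.gauss(0.0, 1.0) for _ in range(n_iter)]
                  for _ in range(n_chains)]
        rhats.append(split_rhat(chains))
    return rhats


if __name__ == "__main__":
    for n_iter in (100, 500, 2500):
        rhats = simulate_rhats(n_params=150, n_chains=4, n_iter=n_iter, seed=2021)
        print(n_iter, round(min(rhats), 4), round(max(rhats), 4))
```

Even in this idealized setting the largest of 150 R-hat values sits noticeably above 1 when the chains are short, and the spread shrinks as the number of draws grows. Real MCMC draws are autocorrelated, so their effective sample size is smaller than the nominal draw count and the spread is correspondingly wider than this simulation suggests.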

To summarize the IMHO most important part from a thread on the topic, “Summarising Rhat values over multiple variables/fits”:

Best of luck with your model!
