How to extract number of divergences and adapt_delta from a brms model

blokeman · September 22, 2020, 10:20am

Hi,

I need to perform exact bootstrap validation of my brms categorical model, with 100 bootstrap iterations. The idea is to re-fit the model to 100 bootstrap samples of the data and calculate discrimination accuracy (weighted average of submodel-specific ROC AUCs) for each bootstrap model.

The reason for doing this rather than using something like elpd_loo is that elpd_loo, while quicker to compute, lacks the clear interpretation of ROC AUC. Anyway, I want to be able to reject bootstrap models with divergent transitions and to refit them with a higher adapt_delta until there are no divergences.

To do this, I need to be able to extract the following information from each brms model:

1. Whether the number of divergent transitions exceeded zero, and
2. What the adapt_delta setting was.

Can this be done?

torkar · September 22, 2020, 10:49am

Hi

Well, for the first question I guess it’s easy: foo <- get_sampler_params(M$fit) (from rstan) stores the sampler parameters in a list with one matrix per chain. In each matrix you’ll find the column divergent__.
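To make that concrete, here is a minimal sketch of counting divergences from that list. The helper names count_divergences and n_divergent are made up for illustration, and the wrapper assumes M is a brmsfit fitted with the rstan backend, so that M$fit is a stanfit object.

```r
# Count divergent transitions from per-chain sampler parameters.
# `sampler_params` is a list with one matrix per chain, as returned by
# rstan::get_sampler_params(); each matrix has a "divergent__" column.
count_divergences <- function(sampler_params) {
  sum(vapply(
    sampler_params,
    function(chain) sum(chain[, "divergent__"]),
    numeric(1)
  ))
}

# Hypothetical wrapper for a brmsfit object `M` (rstan backend assumed).
# inc_warmup = FALSE restricts the count to post-warmup draws.
n_divergent <- function(M) {
  count_divergences(rstan::get_sampler_params(M$fit, inc_warmup = FALSE))
}
```

With this, n_divergent(M) > 0 answers the first question. rstan::get_num_divergent(M$fit) should report the same count without the helper.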

But for #2 I honestly don’t know. You could set it to 0.8 and if there’s a problem, increase to 0.95. But I would strongly advise you to not do this since it’s an indication that the sampler struggles so simply increasing adapt_delta is not a solution in itself.
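One possible way to recover the setting, offered as an unverified sketch rather than a confirmed answer: with the rstan backend, the stanfit object has a stan_args slot holding one argument list per chain. The assumption here is that the control list passed to brm() ends up in that slot. The helper names are hypothetical, and the fallback of 0.8 is Stan's default adapt_delta.

```r
# Pull adapt_delta out of a list of per-chain argument lists, such as the
# `stan_args` slot of a stanfit object. Falls back to `default` when no
# control list (or no adapt_delta entry) was recorded.
extract_adapt_delta <- function(stan_args, default = 0.8) {
  ad <- stan_args[[1]]$control$adapt_delta
  if (is.null(ad)) default else ad
}

# Hypothetical wrapper for a brmsfit `M` fitted with the rstan backend.
# ASSUMPTION: M$fit@stan_args contains the control list given to brm().
get_adapt_delta <- function(M) {
  extract_adapt_delta(M$fit@stan_args)
}
```

If this turns out not to hold for a given brms or rstan version, the safer route is to record the adapt_delta value yourself at the point where each model is fitted.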

blokeman · September 23, 2020, 3:41am

Thanks for the first answer.

As for the second, I don’t see why increasing adapt_delta to get rid of divergences would be wrong. That is precisely what the warning message about divergent transitions says to do. If it cannot be done, the implication would be that the model shouldn’t be fit at all, which is not helpful.

bbbales2 · September 23, 2020, 4:39am

It’s a point of debate; see the topic “Text for warning message”.

I was talking to someone in a similar situation (they were wondering about iteratively increasing adapt_delta). For them it turned out that just running with a higher adapt_delta wasn’t much slower than running with the default.

Following up on what @torkar said, there are a few things you can do to try to figure out what is causing the divergences.

1. Fit with a smaller amount of data so you can iterate faster. It sounds like you’re doing that already though and you’re getting divergences. Did you get divergences with the full dataset?
2. Figure out a set of data that gives you divergences. Try tightening your priors with this model or simplifying bits of the model until the divergences go away. If you can find the part of the model that is causing the divergences, maybe there’s a way to fix it.
3. Simulate small datasets from your model and see if you get divergences fitting it. When you simulate data from your model, just use estimated parameters from a previous fit so you’re in the ballpark of where you think you need to be.

andrewgelman · September 23, 2020, 5:09am

Just to be clear: I’m not morally opposed to increasing adapt_delta. I just think that divergences can be a signal that we can include stronger prior information or use better inits. I think it might lead to trouble if people just do adapt_delta=0.99 out of a feeling that this is the safe option. If the model has problems, the safe option is to improve it (which could involve using more realistic priors).

blokeman · September 23, 2020, 8:50am

The model is weakly identifiable because my analysis is exploratory rather than confirmatory, involving a large number of covariates with significant multicollinearity. Using strong priors might obscure potentially interesting effects whose estimates are uncertain e.g. due to multicollinearity but still worth pointing out for future studies.

Also, Bayesian modeling is alien to most of my audience, so I want to keep the analysis as similar as possible to an equivalent frequentist one. This entails a minimal role for priors. Their only function is to prevent double-digit logits caused by complete separation and to keep the group-level SDs in the same ballpark as we would get with lme4 – hence normal(0,4) and exponential(2), respectively.

Fortunately, preliminary testing suggests that iteratively increasing adapt_delta in cases of divergent transitions is successful at eliminating the divergences.
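For anyone wanting to do the same, the iterative refit can be sketched roughly as below. This is an illustration under assumptions, not the exact code used: refit_until_clean, fit_fun and count_fun are hypothetical names, and the ladder of adapt_delta values is arbitrary. Because the loop records the adapt_delta it used, there is no need to extract it from the fitted object afterwards.

```r
# Try increasing adapt_delta values until a fit has no divergences.
# fit_fun(ad)    : fits the model with adapt_delta = ad and returns the fit
# count_fun(fit) : returns the number of divergent transitions in `fit`
refit_until_clean <- function(fit_fun, count_fun,
                              deltas = c(0.8, 0.9, 0.95, 0.99)) {
  for (ad in deltas) {
    fit <- fit_fun(ad)
    n_div <- count_fun(fit)
    if (n_div == 0) {
      return(list(fit = fit, adapt_delta = ad, n_divergent = 0))
    }
  }
  # No setting removed the divergences: return the last attempt, flagged
  # by a non-zero n_divergent so the caller can reject or inspect it.
  list(fit = fit, adapt_delta = ad, n_divergent = n_div)
}

# Possible use with brms (M is the original brmsfit, boot_data one
# bootstrap sample; both names are placeholders):
# res <- refit_until_clean(
#   fit_fun   = function(ad) update(M, newdata = boot_data,
#                                   control = list(adapt_delta = ad)),
#   count_fun = function(f) rstan::get_num_divergent(f$fit)
# )
```

Bootstrap samples that still diverge at the highest setting come back with n_divergent > 0. As discussed above, those are probably worth inspecting rather than discarding silently.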
