Issue with Model Convergence in brms: Z-Scores for Salmon Abundance Stocks

Hello everyone,

I’m working on a model to estimate the fertility rate of a population of orcas, which is composed of 3 different pods. The goal is to understand whether there’s a relationship between salmon abundances (and their yearly variation) and fertility rates at both the population and pod level.

I’m using brms to fit the model, with z-scores (denoted with the .z suffix) representing the abundances of various salmon stocks. Here’s the model that runs smoothly:

library(brms)

model <- brm(
  Birth ~ (1 | Pod) + s(Age) + (1 | Animal) +
    Cor.z:Pod + Lcol.z:Pod + Mcol.z:Pod + Nor.z:Pod +
    Puso.N.z:Pod + Puso.S.z:Pod + Sfb.z:Pod + Wac.z:Pod,
  # Adding the following stocks breaks the model:
  # + Sgeo.S.z + Urb.z:Pod + Snak.z:Pod + Swvi.z:Pod
  data = data,
  family = bernoulli(),
  iter = 8000,
  warmup = 4000,
  chains = 3,
  cores = 3,
  control = list(adapt_delta = 0.99)
)

This runs quickly, converges well, and all Rhats are close to 1.

However, when I include additional z-scores for the following salmon stocks:

Sgeo.S.z + Urb.z:Pod + Snak.z:Pod + Swvi.z:Pod

the model runs extremely slowly and does not converge, with all Rhats above 1 (the largest is 3.62). I tested these additional stocks one by one and in combinations, which is how I confirmed that they are causing the slowdown and convergence issues. I haven’t yet set any priors for this model, so it uses the brms defaults.

The following warnings are generated when I include these additional stocks:

Warning messages:
1: There were 12000 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded 
2: Examine the pairs() plot to diagnose sampling problems
 
3: The largest R-hat is 3.62, indicating chains have not mixed.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#r-hat 
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess 
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess 

I’m attaching the full R code and dataset used in case that helps in troubleshooting. Has anyone encountered similar issues when adding more group-specific slopes for covariates? Any suggestions on how to handle the treedepth or improve convergence would be greatly appreciated. Would including priors help, or should I be looking into adjusting other aspects of the model?

Thanks in advance!
data_brms.csv (308.0 KB)
model_for_Stan_forums.R (534 Bytes)

I don’t know how brms is handling this, but there’s a general problem of non-identifiability that’s exacerbated with more additive random effects. This gets really bad if the effects aren’t themselves identified either with a sum-to-zero or a pin-a-value-to-zero approach.
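To make the additive non-identifiability concrete, here is a minimal base-R sketch of the simplest case: two location parameters that enter the likelihood only through their sum. The data are simulated and the function names (log_lik, log_post) and the normal(0, 1) prior scale are my own illustrative choices, not anything taken from brms or from the model in this thread.

```r
# Simulated data; NOT the orca dataset.
set.seed(2)
y <- rnorm(50, mean = 1, sd = 1)

# Log-likelihood of y ~ normal(mu1 + mu2, sigma): only the sum mu1 + mu2 matters.
log_lik <- function(mu1, mu2, sigma = 1) {
  sum(dnorm(y, mean = mu1 + mu2, sd = sigma, log = TRUE))
}

# Same likelihood plus independent normal(0, prior_sd) priors on mu1 and mu2
# (the prior scale is an assumption chosen for illustration).
log_post <- function(mu1, mu2, prior_sd = 1) {
  log_lik(mu1, mu2) +
    dnorm(mu1, mean = 0, sd = prior_sd, log = TRUE) +
    dnorm(mu2, mean = 0, sd = prior_sd, log = TRUE)
}

# Two very different parameter pairs with the same sum.
ll_a <- log_lik(0.5, 0.5)
ll_b <- log_lik(100.5, -99.5)
lp_a <- log_post(0.5, 0.5)
lp_b <- log_post(100.5, -99.5)

cat("log-likelihoods equal:", isTRUE(all.equal(ll_a, ll_b)), "\n")
cat("log-posterior prefers the pair near zero:", lp_a > lp_b, "\n")
```

Without priors the sampler sees a flat ridge along mu1 + mu2 = constant and can wander along it indefinitely, which is the kind of geometry that tends to produce treedepth warnings. With proper priors the ridge is no longer flat, although the two parameters stay strongly correlated.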

The problem is even worse with interaction terms since we know they’re introducing heavily correlated predictors.
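One way to check whether the stock covariates really are that correlated is to look at their pairwise correlations and the condition number of the covariate matrix. The sketch below uses simulated stand-in series that share a common signal, which is purely an assumption for illustration; only the column names are borrowed from the post.

```r
# Simulated stand-in for yearly salmon indices; NOT the poster's data.
set.seed(1)
n_years <- 40
shared  <- rnorm(n_years)  # common signal shared by all stocks (assumed)
stocks  <- sapply(1:4, function(i) shared + rnorm(n_years, sd = 0.3))
colnames(stocks) <- c("Sgeo.S.z", "Urb.z", "Snak.z", "Swvi.z")
stocks  <- scale(stocks)   # z-score each column

# Pairwise correlations among the stock covariates.
cor_mat <- cor(stocks)
max_off_diag <- max(abs(cor_mat[upper.tri(cor_mat)]))

# Condition number of the covariate matrix; large values signal near-collinearity.
cond_num <- kappa(stocks, exact = TRUE)

print(round(cor_mat, 2))
cat("largest |correlation|:", round(max_off_diag, 2), "\n")
cat("condition number:", round(cond_num, 1), "\n")
```

On the real data, the same two checks could be run on the z-scored stock columns of the data frame (one row per year, so repeated animal-level rows do not inflate anything). If the four problematic stocks turn out to be close to linear combinations of the ones already in the model, that would be consistent with the slowdown described above.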

I would advise adding in the priors and seeing what happens. It won’t guarantee convergence, but it can help if there are stronger priors on the values. There’s an example in the problematic posteriors chapter of the Stan User’s Guide where I go through what happens when trying to fit the fully non-identified model y ~ normal(mu1 + mu2, sigma).
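As a starting point for adding priors, here is a hedged sketch using the brms prior() interface. The prior families and scales are illustrative assumptions on my part, not recommendations derived from this dataset, and the refit only runs if the attached data_brms.csv is present in the working directory. The formula copies the extra stock terms exactly as written in the post.

```r
library(brms)

# Weakly informative priors; the scales are illustrative assumptions.
priors <- c(
  prior(normal(0, 1), class = "b"),        # population-level slopes (logit scale)
  prior(student_t(3, 0, 1), class = "sd")  # group-level SDs for Pod and Animal
)
print(priors)

# Refit with priors, guarded so the block is harmless without the data file.
if (file.exists("data_brms.csv")) {
  data <- read.csv("data_brms.csv")
  model_full <- brm(
    Birth ~ (1 | Pod) + s(Age) + (1 | Animal) +
      Cor.z:Pod + Lcol.z:Pod + Mcol.z:Pod + Nor.z:Pod +
      Puso.N.z:Pod + Puso.S.z:Pod + Sfb.z:Pod + Wac.z:Pod +
      Sgeo.S.z + Urb.z:Pod + Snak.z:Pod + Swvi.z:Pod,
    data = data,
    family = bernoulli(),
    prior = priors,
    iter = 8000,
    warmup = 4000,
    chains = 3,
    cores = 3,
    # max_treedepth = 12 is an arbitrary illustrative value.
    control = list(adapt_delta = 0.99, max_treedepth = 12)
  )
}
```

Raising max_treedepth only addresses the symptom named in the warning; if the posterior is not identified, the priors are what change the geometry. It may also be worth using far fewer iterations while debugging, since a model that mixes well rarely needs 8000.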
