How to continue iterating a model which has not converged

I have a model which has not yet converged. Here is my code:

model_f <- bf(
    log(y) ~ 1 + log(x) + (1 + log(x) | z),
    sigma ~ 1 + (1 | z),
    alpha ~ 1 + (1 | z),
    family = skew_normal()
)
model <- brm(
    model_f,
    data = d,
    chains = 4, cores = 4,
    iter = 8000, warmup = 7000
)

There are roughly 16,000 points in my dataset d, and 81 distinct values of z. This model took about 24 hours to run, but has not converged (very low n_eff, largest Rhat > 4, chains not mixed). I’d like to continue iterating this model from the state it is in rather than recompiling from scratch. How do I do this? Is there a way to capture the final state and pass it in as an initial value?

Thank you!


Hi, in these kinds of situations it is highly unlikely that iterating longer will help much; there is most likely a fundamental problem in your model and/or its (mis)match to your data.

My best guess is that the model is just too complex to be informed by your data. How many data points do you have? How many replicates for each level of z?

In any case, I would start by simplifying the model: try to find the most complex model that samples well and the simplest model that causes problems. Some of the advice at Divergent transitions - a primer also applies to problems with high Rhat.
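As a sketch of that strategy (reusing the formula from your original post; the object names here are just illustrative):

```r
library(brms)

# Step 1: drop the distributional parts (sigma, alpha) and the skew,
# keeping only the core regression with varying intercepts and slopes.
simple_f <- bf(log(y) ~ 1 + log(x) + (1 + log(x) | z))
fit_simple <- brm(simple_f, data = d, family = gaussian(),
                  chains = 4, cores = 4)

# Step 2: if that samples cleanly, add one piece back at a time,
# e.g. the distributional submodel for sigma:
mid_f <- bf(log(y) ~ 1 + log(x) + (1 + log(x) | z),
            sigma ~ 1 + (1 | z))
fit_mid <- brm(mid_f, data = d, family = gaussian(),
               chains = 4, cores = 4)
```

The first model that starts showing high Rhat or low n_eff tells you which part of the structure your data cannot support.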

Best of luck with your model!


@martinmodrak is a far better resource than I am, so absolutely look to his posts for guidance here. As he noted, I don’t think additional iterations will fix the problems; failure to converge should be treated as a misspecification problem rather than an estimator problem. I just wanted to flesh out some things from your initial post that struck me as unusual.

To answer your original question: I believe there isn’t a way to pick up and keep fitting a model that has already been estimated. You can at least avoid recompiling the model by using the update() function.
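For example (a sketch, assuming the fitted object is called model as in your post):

```r
library(brms)

# update() reuses the compiled Stan model as long as the model code
# itself is unchanged, so this skips recompilation. Note that the
# chains still start fresh; they do not resume from the previous
# run's final state.
model2 <- update(
    model,
    iter = 4000, warmup = 2000,  # a more typical warmup/sampling split
    chains = 4, cores = 4
)
```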

Do you mind sharing the steps that you’ve taken to get to this point? The fact that you are running 8000 iterations but 7000 of these are warmup is a little odd to me, so it makes me think that you’ve tried various forms of this model.

It’d also be helpful to see what the actual output of your model fit looks like. Are all of the parameters very poorly estimated, or are some better than others?

Additionally, is there a reason you’re taking the log of your variables inside the formula rather than transforming the variables first and fitting on those? I’ve made the mistake of wrapping variables in log() and forgetting that they contained zeros, so the model wasn’t being fit on the data I thought it was: in R, log(0) returns -Inf and log() of a negative value returns NaN, either of which can silently distort or break the fit. It’s always good to double-check that those helper/convenience functions aren’t doing something unexpected.
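For instance, a quick sanity check plus an explicit pre-transformation might look like this (a sketch using the column names from your post):

```r
# Check for values that log() cannot handle before fitting:
# log(0) gives -Inf, and log() of a negative value gives NaN.
any(d$y <= 0)
any(d$x <= 0)

# Transforming explicitly makes the data the model sees unambiguous:
d$log_y <- log(d$y)
d$log_x <- log(d$x)
model_f <- bf(
    log_y ~ 1 + log_x + (1 + log_x | z),
    sigma ~ 1 + (1 | z),
    alpha ~ 1 + (1 | z),
    family = skew_normal()
)
```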

Regardless, the fact that your outcome is log-transformed suggests to me that it is a strictly positive value, meaning that the skew normal distribution (and its associated distributional parameters) may not be the most appropriate thing to fit. Have you tried other likelihoods?
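For example, you could model y directly on the raw scale with a lognormal likelihood (purely a sketch; I haven’t seen your data, so treat the structure here as an assumption):

```r
library(brms)

# lognormal() is built for strictly positive outcomes, and sigma can
# still get its own distributional submodel as in your original model.
model_ln <- brm(
    bf(y ~ 1 + log(x) + (1 + log(x) | z),
       sigma ~ 1 + (1 | z)),
    family = lognormal(),
    data = d,
    chains = 4, cores = 4
)
```

Gamma or Weibull likelihoods would be other candidates worth comparing if the lognormal fits poorly.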