Bayes factor for heterogeneous residual variances does not converge

I am currently modelling a data set of around 6000 people and need to compare 2 models via Bayes factors.

prior = c(prior(normal(0.6, 0.2), class = Intercept),
            prior(normal(0.0, 0.3), class = "b"))

fit0 <- brm(bf(score ~ group * test_id * gender + (1|school_id) + (1|teacher_id), sigma ~ (1|student_ID)),
  family = skew_normal(),
  data = data,
  prior = prior,
  warmup = 1000,
  iter = 5000,
  chains = 4,
  cores = parallel::detectCores(),
  save_pars = save_pars(all = TRUE),
  control = list(adapt_delta = 0.95, max_treedepth = 15)
)

fit1 <- brm(bf(score ~ group * test_id + gender * test_id + gender * group + (1|school_id) + (1|teacher_id), sigma ~ (1|student_ID)),
  family = skew_normal(),
  data = data,
  prior = prior,
  warmup = 1000,
  iter = 5000,
  chains = 4,
  cores = parallel::detectCores(),
  save_pars = save_pars(all =TRUE),
  control = list(adapt_delta = 0.95, max_treedepth = 15)
)

bayes_factor(fit0, fit1)

It takes quite some while to run and the two models turn out fine. The problem is, that no Bayes factor is computed as the maximum of iterations is exceeded. If I run the above code and drop the heterogeneous assumption on sigma (choosing no distribution), I get a perfectly stable Bayes factor in seconds. I redid this computation already with 20000 and 2000 iterations, but no improvement whatsoever.

I also looked at the bayes_factor documentation online, but I did not find any useful parameter I could change. Do you have any suggestion on computing the Bayes factor with the desired distribution of sigma? Is it because I did not specify any prior distribution for sigma?

Thanks a lot!

You should be setting proper priors on all parameters if you’re going to use bayes factors. See some discussion relevant resources on this thread:

1 Like

Many thanks! How precisely do you set the prior for sigma? Because if I write class equal to sigma I get an error message saying that this is not a prior…

I think you’ll find the function get_prior() really useful here.

When you are just estimating one number for sigma, you can set class="sigma". But when you are estimating how this changes in a distributional regression framework, things change. You instead have an Intercept and beta parameters just like modelling the mean, and you have to specify which parameter your priors are for using the dpar = argument.

Look at this example with the built-in iris dataset

# Loading iris data and 
data(iris)
contrasts(iris$Species) <- contr.sum(3)

# Specifiying distributional model formula
mf <- bf(Sepal.Length ~ Species,
         sigma ~ Species)

# Checking which priors can be set
# Checking which priors can be set
get_prior(mf,
          data = iris,
          skew_normal())


                  prior     class     coef group resp  dpar nlpar lb ub       source
                 (flat)         b                                            default
                 (flat)         b Species1                              (vectorized)
                 (flat)         b Species2                              (vectorized)
 student_t(3, 5.8, 2.5) Intercept                                            default
                 (flat)         b                     sigma                  default
                 (flat)         b Species1            sigma             (vectorized)
                 (flat)         b Species2            sigma             (vectorized)
   student_t(3, 0, 2.5) Intercept                     sigma                  default

Now, you don’t need to set priors on all of these, these are just the options available. I also point out that the default for the skew_normal family is to have sigma on the log scale. You can change this if you wish by having skew_normal(link_sigma="identity").

Also, it’s a good thing to know that class = "Intercept" is not setting a prior on the Intercept as it is generally interpreted, but instead on the centered design matrix. This may or may not be important depending on how balanced your data is. This is in the documentation, but instead I learned it very publicly here:

1 Like