Error when trying to compare brmsfit models with loo.brmsfit : Error: Backend 'rstan' is required for this method

Hi,

I fitted 4 different hierarchical models on a large dataset (~70 000 observations). I would like to compare the model performances using LOO-PSIS. I started with a baseline model where I add complexity to the other ones (polynomial terms + interactions).

Here are some specifications of the data:

  • env_ID = factor with 27 levels (environments)
  • ID = factor with 2171 levels (individuals)

Here is the structure of the 1st model. I want to investigate the relationship between different predator hunting behaviours and their hunting success.

# Set priors
priors <- set_prior("normal(0, 5)", class = "b")

# Model formula
model_formula <- brmsformula(hunting_success | trials(4) ~
                                        speed+
                                        space +
                                        ambush +
                                        latency +
                                        (1 | env_ID) +
                                        (1 | ID) +
                                        (1 | obs))

# Baseline model
system.time(base_model <- brm(formula = model_formula,
                              family = binomial(link = "logit"),
                              warmup = 3000, 
                              iter = 53000,
                              thin = 50,
                              chains = 4, 
                              inits = "0", 
                              threads = threading(10),
                              backend = "cmdstanr",
                              seed = 1234,
                              prior = priors,
                              save_pars = save_pars(all = TRUE),
                              control = list(adapt_delta = 0.95),
                              data = data))

The model runs without problems (and the other ones too). I used within-chain parallelization because the models take a very long time to run (on a remote computer cluster with 40 cores). The posterior predictive checks did not seem to indicate a problem with the model fit. However, when I run the loo function on my model, I get this :

> loo1 <- brms::loo(base_model1, "loo")
Warning message:
Found 9925 observations with a pareto_k > 0.7 in model 'base_model1'. It is recommended to set 'moment_match = TRUE' in order to perform moment matching for problematic observations.
> loo1

Computed from 4000 by 70831 log-likelihood matrix

          Estimate    SE
elpd_loo -100031.1 132.2
p_loo      27778.4  86.1
looic     200062.2 264.4
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count Pct.    Min. n_eff
(-Inf, 0.5]   (good)     28137 39.7%   186
 (0.5, 0.7]   (ok)       32769 46.3%   72
   (0.7, 1]   (bad)       9818 13.9%   16
   (1, Inf)   (very bad)   107  0.2%   6
See help('pareto-k-diagnostic') for details.

I understand that the output indicates that the model might be misspecified. The problem is when I try to use moment_match = TRUE in the loo function, it returns :
Error: Backend ‘rstan’ is required for this method.

Does it mean I would have had to set the backend = “rstan” in the brm function? But then I could not use within-chain parallelization? I could not understand how the problem was fixed reading an answer provided by Paul Buerkner in this post https://discourse.mc-stan.org/t/difficulty-comparing-models-with-loo/17271/22

Unfortunately, I cannot provide the data to reproduce this example as it is private. The models were fitted with brms 2.15.0 under R 4.0.2 on CentOS Linux 7.

Lastly, from what I understood reading this post https://discourse.mc-stan.org/t/difficulty-comparing-models-with-loo/17271/22 and these posts https://avehtari.github.io/modelselection/rats_kcv.html, https://avehtari.github.io/modelselection/roaches.html, should I instead go with k-fold cross validation? Or maybe 14% of observations being problematic is not so important?

Thank you very much for your help.

in the latest brms dev version this should work with the cmdstanr backend as well. if not you may have to refit your model with that version.

Hi Paul, thank you very much for your quick reply.

So from what I understood, I could just import the saved model object in R within my computer, and run loo() on the object (using the latest dev version of brms) and it should work without having to refit the model?

you could try it out at least. I hope it will work.