Why is posterior_linpred so slow?


I fitted this model with brms on a large dataset (70,000 observations):

model_formula <- brmsformula(hunting_success | trials(4) ~
                                        Zspeed +
                                        Zspace_covered_rate +
                                        Zprox_mid_PreyGuarding +
                                        Zhook_start_time +
                                        Zgame_duration +
                                        (1 | map_name) +
                                        (1 | player_id) +
                                        (1 | obs))
base_model <- brm(formula = model_formula,
                  family = binomial(link = "logit"),
                  warmup = 3000, 
                  iter = 11000,
                  thin = 32,
                  chains = 4, 
                  inits = "0", 
                  threads = threading(10),
                  backend = "cmdstanr",
                  seed = 123,
                  prior = priors,
                  control = list(adapt_delta = 0.95),
                  save_pars = save_pars(all = TRUE),
                  sample_prior = TRUE,
                  data = data)

The size of the output is ~1.5 GB. I want to extract the predicted values on the response scale to compute the linear trends for each fixed effect. However, I don’t understand why extracting the draws takes so much time.

For instance, I ran this to get the predicted values of the model, but it has been running for more than 20 minutes and it is only 30 draws:

draws <- posterior_linpred(base_model,
                            re_formula = NA,
                            transform = FALSE,
                            ndraws = 30,
                            seed = 123)

Am I doing something wrong? Is it because of the overdispersion term? I didn’t have this problem before with an earlier version of brms so I don’t understand what might be going on. Is there a way to extract the predicted values in a faster way so I can easily manipulate them myself?

This is my computer setup:
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
brms_2.16.1, Rcpp_1.0.7

Thank you very much for your help!



This might be related to some recent changes in brms, which now delegates more work to the posterior package. Tagging @paul.buerkner, who previously suggested that the relevant code in posterior may be slower than we thought.

You can always use the $fit member to access the underlying Stan fit and do whatever summaries you need directly.
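
As a rough sketch of what that could look like for population-level predictions (the equivalent of re_formula = NA): take a draws-by-parameters matrix and multiply it by the design matrix yourself. The helper name manual_linpred and the simulated draws below are purely illustrative; the column names assume the usual brms naming scheme (b_Intercept, b_<term>), and the commented-out extraction line assumes that $fit is an rstan stanfit object.

```r
# Hypothetical helper: population-level linear predictor from a draws matrix.
# `draws`  : draws x parameters matrix, columns named "b_Intercept", "b_<term>"
# `newdata`: data frame holding the predictor columns listed in `terms`
manual_linpred <- function(draws, newdata, terms) {
  X <- cbind(1, as.matrix(newdata[, terms, drop = FALSE]))
  B <- draws[, c("b_Intercept", paste0("b_", terms)), drop = FALSE]
  B %*% t(X)  # draws x observations, on the link (logit) scale
}

terms <- c("Zspeed", "Zgame_duration")

# With a real fit one might extract the draws like this (assumption, not run):
# draws <- as.matrix(base_model$fit,
#                    pars = c("b_Intercept", paste0("b_", terms)))

# Simulated stand-in so the sketch runs without the fitted model
set.seed(123)
draws <- cbind(b_Intercept      = rnorm(30, -1.0, 0.10),
               b_Zspeed         = rnorm(30,  0.5, 0.05),
               b_Zgame_duration = rnorm(30, -0.2, 0.05))

# Vary one predictor, hold the other at its mean (0 for z-scored variables)
newdata <- data.frame(Zspeed = seq(-2, 2, length.out = 50),
                      Zgame_duration = 0)

eta <- manual_linpred(draws, newdata, terms)  # logit scale
p   <- plogis(eta)                            # per-trial success probability
```

Predictors left out of `terms` are implicitly held at zero, which for z-scored covariates means their mean. Because only a handful of columns are touched, this avoids converting the whole 1.5 GB set of draws.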

Could you please confirm that reverting to an older version of brms improves the performance?
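
One possible way to make that comparison, sketched under assumptions: time the same call under each installed version. The helper time_call is hypothetical; remotes::install_version is shown only in a comment, and a model fitted under a newer brms may not load cleanly under an older one.

```r
# Hypothetical helper: elapsed wall-clock seconds for a single call.
time_call <- function(fun, ...) {
  unname(system.time(fun(...))["elapsed"])
}

# With the fitted model loaded, one might run (not executed here):
# remotes::install_version("brms", version = "2.15.0")  # then restart R
# time_call(brms::posterior_linpred, base_model,
#           re_formula = NA, ndraws = 30)

# Stand-in workload so the sketch runs without brms
elapsed <- time_call(function(n) sum(sqrt(seq_len(n))), 1e6)
```

Running the same call in a fresh R session for each version keeps the timings comparable.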


(This post became obsolete after the post before was deleted)

Sorry, Paul!

(Posted something before about a possible bug before realising I had a typo in my code!)

I have been having the same issue after updating brms yesterday.

After reading this thread, I ran posterior_epred on the same model using three different package versions:

  • 2.16.3 : 13 minutes 37 seconds
  • 2.15.0 : 11 minutes 57 seconds
  • 2.14.0 : 11 minutes 49 seconds

Model has an n of about 9000, and I’m predicting over a pretty hefty post-stratification frame.


pp <- posterior_epred(test.mod,
                       ndraws = 20,

Scale that up and the differences start adding up quite quickly, I guess?

Thank you @patrick-eng for providing a reproducible example; I have not had the time to do so over the last few days! And thank you @martinmodrak for the suggestion of using $fit, I did not know that we could do that.


Yeah, I assume this is because extraction via posterior is apparently not the quickest yet. Do things get quicker if you install the latest dev version of posterior from GitHub (stan-dev/posterior)?


That does appear to help, Paul, yes! :-)