Model sampling using built-in beta_binomial family is far slower than the custom one

Maxime · May 9, 2025, 5:56pm

Hello,

Following on this post that I made a while ago, I have now updated all my libraries and started running my models on the cluster.

@paul.buerkner suggested to:

Install the latest brms version
Use the newly built-in beta_binomial() family instead of the custom one that I implemented following the vignette

Now, here is my problem :

With the custom beta_binomial(), the model is now much slower to fit.
Originally, i used 12 threads per chain and sampled the priors using sample_prior=TRUE, and all my models would finish running within 3 to 5 days with 64Gb of RAM maximum.
Now, my models cannot finish within this period.

I tried reducing the number of threads as it can increase memory consumption, so now the threading is set to 8 and I don’t sample the prior. After close to 48 hours of runtime, the simplest model only reached 100 samples per chain on a total of 1500. My data has around 170K observations. You can see my attached R file.

Any help to understand why it changed would be appreciated.

Thank you!

GAMM-I.R (3.7 KB)

jsocolar · May 9, 2025, 6:11pm

Can you create a reproducible example, with simulated data and a minimally complicated effects structure (i.e. intercept only) that enables us to fit both your custom family and the brms built-in?

Maxime · May 9, 2025, 7:27pm

Thank you for your answer @jsocolar.

Here is a reproducible example that I am running myself at the moment.

# ==========================================================================
# Simulate data with 4 predator_ids and standardized covariates
# ==========================================================================

options(mc.cores = parallel::detectCores())
library(data.table)
library(brms)
library(parallel)

set.seed(42)

# Simulation parameters
n_predators <- 4
n_per_pred <- 250
total_obs <- n_predators * n_per_pred

# Generate predictor variables
sim_data <- rbindlist(lapply(1:n_predators, function(i) {
  predator <- paste0("sim_pred_", i)
  
  cumul_xp_pred <- rnorm(n_per_pred, mean = 10, sd = 5)
  game_duration <- rnorm(n_per_pred, mean = 500, sd = 100)
  prey_avg_speed <- rnorm(n_per_pred, mean = 2, sd = 0.3)
  prey_avg_rank <- rnorm(n_per_pred, mean = 12, sd = 3)
  
  # Generate hunting_success based on cumul_xp_pred
  mu <- 1 / (1 + exp(-0.5 * (cumul_xp_pred - 10) / 5))  # logistic on standardized xp
  hunting_success <- rbinom(n_per_pred, size = 4, prob = pmin(pmax(mu, 0.05), 0.95))
  
  data.table(
    predator_id = predator,
    hunting_success = hunting_success,
    game_duration = game_duration,
    prey_avg_speed = prey_avg_speed,
    cumul_xp_pred = cumul_xp_pred,
    prey_avg_rank = prey_avg_rank
  )
}))

# Standardize predictors
sim_data[, Zgame_duration := scale(game_duration)]
sim_data[, Zprey_avg_speed := scale(prey_avg_speed)]
sim_data[, Zcumul_xp := scale(cumul_xp_pred)]
sim_data[, Zprey_avg_rank := scale(prey_avg_rank)]
sim_data[, predator_id := as.factor(predator_id)]


# ==========================================================================
# Set up model formula and priors
# ==========================================================================

model_formula_builtin <- brmsformula(
  hunting_success | trials(4) ~
    Zcumul_xp +
    (1 | predator_id)
)

model_formula_custom <- brmsformula(
  hunting_success | vint(4) ~
    Zcumul_xp +
    (1 | predator_id)
)

# Priors for complex gamm model
#priors <- c(
#  set_prior("normal(1, 0.5)", class = "b", coef = "Zgame_duration"),
#  set_prior("normal(0, 1)", class = "b", coef = "Zprey_avg_rank"),
#  set_prior("normal(0, 2)", class = "b", coef = "sZcumul_xp_1"),
#  set_prior("normal(0, 0.5)", class = "Intercept"),
#  set_prior("normal(0, 0.5)", class = "sds"),
#  set_prior("normal(log(2), 0.5)", class = "phi")
#)

# Priors for simpler gamm model
#priors <- c(
#  set_prior("normal(0, 2)", class = "b", coef = "sZcumul_xp_1"),
#  # prior on the intercept
#  set_prior("normal(0, 0.5)", class = "Intercept"),
#  # prior on sds parameters
#  set_prior("normal(0, 0.5)", class = "sds"),
#  # priors on phi
#  set_prior("normal(log(2), 0.5)", class = "phi")
#)

# Priors for linear model
priors <- c(
  set_prior("normal(0, 1)", class = "b", coef = "Zcumul_xp"),
  set_prior("normal(0, 0.5)", class = "Intercept"),
  set_prior("normal(0, 0.5)", class = "sd"),
  set_prior("normal(log(2), 0.5)", class = "phi")
)


# ==========================================================================
# Fit model using built-in beta_binomial
# ==========================================================================

fit_builtin <- brm(
  formula = model_formula_builtin,
  family = beta_binomial(),
  data = sim_data,
  warmup = 200,
  iter = 1200,
  thin = 1,
  chains = 4,
  threads = threading(3),
  backend = "cmdstanr",
  seed = 123,
  prior = priors,
  control = list(adapt_delta = 0.99)
)

# ==========================================================================
# Fit model using custom beta_binomial2
# ==========================================================================

beta_binomial2 <- custom_family(
  "beta_binomial2", dpars = c("mu", "phi"),
  links = c("logit", "log"), lb = c(NA, 0),
  type = "int", vars = "vint1[n]"
)

stan_funs <- "
  real beta_binomial2_lpmf(int y, real mu, real phi, int T) {
    return beta_binomial_lpmf(y | T, mu * phi, (1 - mu) * phi);
  }
  int beta_binomial2_rng(real mu, real phi, int T) {
    return beta_binomial_rng(T, mu * phi, (1 - mu) * phi);
  }
"

stanvars <- stanvar(scode = stan_funs, block = "functions")

fit_custom <- brms(
  formula = model_formula_custom,
  family = beta_binomial2,
  stanvars = stanvars,
  data = sim_data,
  warmup = 200,
  iter = 1200,
  chains = 4,
  thin = 1,
  threads = threading(3),
  backend = "cmdstanr",
  seed = 123,
  prior = priors,
  control = list(adapt_delta = 0.99)
)

I am running the test on Pop!OS 22.04 LTS with brms 2.22.0 and R version 4.1.3

Best,

Maxime

Maxime · May 9, 2025, 9:33pm

So I got my answers from this simple test. The results seem to explain my suspicions:

The time to run the model with the built-in family took 3784.3 seconds
The time to run the model with the custom family took 12.1 seconds

Also, a part from the Phi estimate, the coefficients are mostly similar :

r$> summary(fit_builtin)
 Family: beta_binomial 
  Links: mu = logit; phi = identity 
Formula: hunting_success | trials(4) ~ Zcumul_xp + (1 | predator_id) 
   Data: sim_data (Number of observations: 1000) 
  Draws: 4 chains, each with iter = 1200; warmup = 200; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~predator_id (Number of levels: 4) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.07      0.08     0.00     0.28 1.61        7       11

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.00      0.04    -0.09     0.09 1.35      910      831
Zcumul_xp     0.45      0.00     0.44     0.45 1.02      431     2088

Further Distributional Parameters:
    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi    16.22      0.24    15.73    16.71 1.01     1833     1960

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Warning messages:
1: Parts of the model have not converged (some Rhats are > 1.05). Be careful when analysing the results! We recommend running more iterations and/or setting stronger priors. 
2: There were 1 divergent transitions after warmup. Increasing adapt_delta above 0.99 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup

r$> summary(fit_custom)
 Family: beta_binomial2 
  Links: mu = logit; phi = identity 
Formula: hunting_success | vint(4) ~ Zcumul_xp + (1 | predator_id) 
   Data: sim_data (Number of observations: 1000) 
  Draws: 4 chains, each with iter = 1200; warmup = 200; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~predator_id (Number of levels: 4) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.08      0.08     0.00     0.29 1.00     1157     1522

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.00      0.06    -0.12     0.12 1.00     1317     1163
Zcumul_xp     0.43      0.04     0.35     0.51 1.00     3268     2544

Further Distributional Parameters:
    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi     4.78      0.29     4.23     5.36 1.00     2743     2220

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

However, it seems like the model with the built-in family struggles a lot to sample the parameters, as we can see with the very low ESS on the random effects for example, as well as the higher Rhat values.

Any idea of what could be going on? This is a very simple model on 1000 observations

Maxime

jsocolar · May 10, 2025, 1:54am

Using the reprex that you posted on github, I cannot reproduce the behavior locally. When I use 200 warmup iterations as in the reprex, I get some variation in the speed, presumably due to the quality of warmup, but nothing approaching the timings you’ve seen. When I use the default warmup schedule, thing seem a bit more stable, and I see no sign of problems in estimation nor of a big performance hit.

MilaniC · May 10, 2025, 2:07am

Your built_in model and reprex takes only 8 seconds on my M3 MacBook Pro — not your >3000 seconds. You are using very dated versions of both R and brms (my versions: R 4.5.0 and brms 2.22.12 [dev version]). So updating might resolve your issue?

Maxime · May 10, 2025, 10:08am

Hello,

Isn’t the latest version of brms 2.22.0?

I’ll try with a newer R version to see if it helps and post the results.

Best

MilaniC · May 10, 2025, 10:16am

No the latest version of brms is 2.22.12 on Github (development version).

Maxime · May 10, 2025, 4:41pm

I updated to the latest R version first and it did not do the trick. I then installed the latest development version of brms, and now the builtin model runs as fast as the custom one.

Thank you for the answers!

Best

Maxime

Topic		Replies	Views
Brms custom family samples less well than Stan brms rstan , performance , brms	1	48	May 25, 2025
Negative binomial shape and phi parameters brms	14	7522	April 4, 2019
BRM function suddenly stopped running and throws a range of different errors Modeling	26	2141	October 7, 2020
Running time for hierarchical model brms	11	4224	February 12, 2019
Error with custom family Modeling brms	11	662	December 10, 2023

Model sampling using built-in beta_binomial family is far slower than the custom one

Related topics