Speed up fitting using y_mean | se(y_se) on mean and SE?

I have a big hierarchical model with crossed random effects which takes 3½ days to run on a dataset with ~ 80.000 observations. I need to do it for three variants of the model, and even worse, I will likely need to do it again for double the number of observations. So I’m looking for ways to increase speed.

Since family = gaussian, I tried computing the mean and SEM (the likelihood parameters) for each cell and replacing brm(y ~ ...) with brm(y_mean | se(y_sem) ~ ...). In a toy example below with 100 data points per cell, this perfectly recovered all parameters and increased total sampling speed from ~600 secs to ~ 20 secs (around 20-fold speedup when factoring in effective sample size). Even for small samples of 8 data points per cell, sampling is speeded up from ~ 20 secs to ~ 7 secs (around 2-fold speedup when factoring in effective sample size).

As far as I can see, the second model only differs from the first in that there is no prior on sigma. I have a fairly weak prior on sigma which is totally overwhelmed by data, so the effects of that “omission” should be negligible. Am I missing other differences here or is this a safe approach?

Toy example to study speedup

First make some data:

library(tidyverse)
library(brms)

# Generate data.
# The predictors: 100 trials for each of 5 conditions within each participant
D_raw = expand.grid(trial = 1:100,
                condition = 1:5,
                id = 1:10) %>%
  # Now the response variable with different intercepts per id and condition
  mutate(
    y = rnorm(n(), mean = condition + id, sd = 1),
    condition = factor(condition),
    id = factor(id)
  )

Fit the model on the raw data:

fit_raw = brm(y ~ condition + (1 + condition | id), D_raw)

… and now fit it using the the faster solution:

# Compute likelihoods per cell
D_lik = D_raw %>%
  group_by(id, condition) %>%
  summarise(y_mean = mean(y),
            y_sem = sd(y) / sqrt(n()))

# Fit model on likelihoods
fit_lik = brm(y_mean | se(y_sem) ~ condition + (1 + condition | id), D_lik)

Compare fits

Here is fit_raw:

Population-Level Effects: 
           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept      6.47      1.13     4.26     8.74        713 1.01
condition2     0.93      0.18     0.59     1.29       4074 1.00
condition3     2.28      0.21     1.88     2.69       3026 1.00
condition4     3.02      0.18     2.66     3.37       3549 1.00
condition5     4.06      0.19     3.69     4.43       3870 1.00

… and here is fit_lik:

Population-Level Effects: 
           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept      6.53      1.24     4.15     9.07        611 1.00
condition2     0.92      0.16     0.62     1.25       2749 1.00
condition3     2.27      0.22     1.83     2.69       2364 1.00
condition4     3.05      0.16     2.74     3.39       2412 1.00
condition5     4.07      0.18     3.73     4.44       2495 1.00

Just one thing after quickly looking at your code. When you just use se() you will not estimate sigma anymore. To make sure, sigma is estimated still, use se(..., sigma = TRUE). Basically, what you are doing is meta-analysis and if its assumptions are justified it is uses all available information for estimating the summary effect size (the effect of condition in your case).

Thank you! This is comforting to hear.

Does the meta-regression approach entail more assumptions than what is already “embodied” in the “raw” hierarchical model? For example, if I use non-normal data, add unmodelled correlations, heterogeneity, etc. I would still expect them to yield the same posteriors (I could be wrong).

Just asking to learn how generic this approach is because I’m considering writing up a blog :-)

As soon as you go beyond a very simple meta-analytic model, things are no longer as easy and I can’t give any general answer to that. I would hope that standard meta-analysis literature/books talks about such issues.

1 Like