Brms and emmeans with incl_autocor argument

jkueser · November 6, 2022, 9:34pm

Hello:

I’m using emmeans on a model that includes an ar(1) residual correlation structure. I’m trying to call emmeans with the incl_autocor = F argument - an argument that brms::prepare_predictions can take. From looking at the code, it seems that emmeans calls prepare_predictions and should pass the incl_autocor argument along when called. However, when I try, the following happens…

emm_model_splines_group1 <- emmeans(model_splines_final1, ~ Group,
+                                            incl_autocor = F)
Error in prepare_predictions.brmsfit(object, newdata = newdata, re_formula = re_formula,  : 
  formal argument "incl_autocor" matched by multiple actual arguments

I’m not sure how to deal with this error. Any suggestions? For what it’s worth, using a prior version of emmeans and brms about a year and half ago, this error didn’t happen - though I can’t recall which versions I was using.

Thanks!

jkueser · November 7, 2022, 4:15pm

Here’s a reproducible example showing what’s going on…

> library(emmeans)
> library(brms)
Loading required package: Rcpp
Loading 'brms' package (version 2.18.0).
> library(tidyverse)
> data("LakeHuron")
> LakeHuron <- data.frame(x = as.matrix(LakeHuron), time = as.integer(time(LakeHuron)))
> LakeHuron2 <- rbind(LakeHuron %>% mutate(Region = 'FakeRegion1'), 
+                     LakeHuron %>% mutate(Region = 'FakeRegion2', x = x - 30))
> fit <- brm(x ~ Region + ar(p = 2, time = time, gr = Region), data = LakeHuron2)
Compiling Stan program...
Start sampling
> summary(fit)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: x ~ Region + ar(p = 2, time = time, gr = Region) 
   Data: LakeHuron2 (Number of observations: 196) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Correlation Structures:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
ar[1]     1.07      0.07     0.93     1.20 1.00     2447     2455
ar[2]    -0.25      0.07    -0.39    -0.11 1.00     2389     2419

Population-Level Effects: 
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           579.24      0.38   578.54   580.05 1.00     3134     2399
RegionFakeRegion2   -30.01      0.51   -31.01   -28.99 1.00     3252     2472

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.71      0.04     0.64     0.79 1.00     2794     2843

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
> emm1 <- emmeans(fit, ~ Region)
> emm1
 Region      emmean lower.HPD upper.HPD
 FakeRegion1  579.2     578.5       580
 FakeRegion2  549.2     548.4       550

Point estimate displayed: median 
HPD interval probability: 0.95 
> emm2 <- emmeans(fit, ~ Region, incl_autocor = F)
Error in prepare_predictions.brmsfit(object, newdata = newdata, re_formula = re_formula,  : 
  formal argument "incl_autocor" matched by multiple actual arguments

Let me know if anything else would be useful.

Versions…
[1] forcats_0.5.2 stringr_1.4.1 dplyr_1.0.10 purrr_0.3.5 readr_2.1.3 tidyr_1.2.1 tibble_3.1.8
[8] ggplot2_3.3.6 tidyverse_1.3.2 brms_2.18.0 Rcpp_1.0.9 emmeans_1.8.2

paul.buerkner · November 8, 2022, 10:59am

incl_autocor is always FALSE if emmeans is used with brms. So that is why it complains about duplicated arguments.

jkueser · November 8, 2022, 2:44pm

Thanks for the reply!

The reason I was asking about this is that I’ve seen a substantial slow-down in my calls to emmeans functions since brms version 2.14.0 for a large model that includes an ar(1) structure. I thought that the slowdown might have been that the autocorrelation was being included but I guess I was on the wrong track.

Since I have you here, I wonder if you could offer any suggestions for options that might speed up the call to emmeans. Here is a quick example showing what I mean. I created a little dataset complicated enough to bring out the differences between brms versions for a call to an emmeans function (emtrends) - link to model test_model_2022-11-07.rds - Google Drive.

The upshot is that brms v 2.18 takes about twice as long to do the emtrends call as brms v 2.14.0. In my actual model which is much larger than the example below, 2.14.0 is at least 10-20 times faster, making calls to emmeans in 2.18 almost unusable. A quick profile of the calls in each of versions suggested that it might have been due to differences in how the ref_grid is created. It seems that the ref_grid in 2.18 includes the time and grouping factor from the ar(1) structure while brms v 2.14.0 doesn’t.

Do you have any suggestions for how to get the 2.14.0 behavior in 2.18?

Thanks again!

# Fake dataset with trend, grouping factors, largish
x1 <- c(rep('A', 3000), rep('B', 3000))
x2 <- rnorm(6000, mean = 2, sd = 2)
y_orig <- rnorm(6000, mean = 10, sd = 5)
y_mod <- if_else(x1 == 'A', y_orig-x2, y_orig+x2)
gr <- rep(1:60, each = 100)
gr_mod <- rnorm(60, mean = 1, sd = .5) 
gr_mod2 <- rep(gr_mod, each = 100)
t <- rep(1:100, 60)
y_trend <- y_mod+y_mod*t/100
y_final <- y_trend + gr_mod2
df <- data.frame(y = y_final, x1 = x1, x2 = x2, gr = gr, t = t)

fit <- brm(y ~ x1*x2 + (1 | gr) + ar(p = 1, time = t, gr = gr), 
           data = df,
           chains = 4,
           cores = 4,
           file = 'test_model_2022-11-07')
summary(fit)

# system.time(emm1 <- emtrends(fit, ~ x1, var = 'x2'))
emm1 <- emtrends(fit, ~ x1, var = 'x2')
emm1

# Time for brms version 2.14.0: 0.861, 0.650, 0.850 (mean: 0.787)
# Ref grid for brms version 2.14.0
# > ref_grid(fit, ~ x1, var = 'x2')
# 'emmGrid' object with variables:
#   x1 = A, B
# x2 = 1.9879


# Time for brms version 2.18: 1.674, 1.246, 1.363 (mean: 1.43; x1.82 longer)
# Ref grid for brms version 2.18
# > ref_grid(fit, ~ x1, var = 'x2')
# 'emmGrid' object with variables:
#   x1 = A, B
# x2 = 1.9879
# t = 50.5
# gr = 30.5

paul.buerkner · November 8, 2022, 7:27pm

Thanks! Yes, it makes sense why emmeans now takes so much longer due to ref_grid stuff. I will work on a fix. Would you mind opening an issue on paul-buerkner (Paul-Christian Bürkner) · GitHub about this to make it easier for me to keep track of it?

jkueser · November 8, 2022, 8:34pm

Sure, thanks for your help!

Topic		Replies	Views
Argument 'autocor' should be specified within the 'formula' argument brms specification	11	1552	October 25, 2020
Autocor should be specified error, but is specified brms	6	620	February 22, 2021
Using 'cor_brms' objects for 'autocor' is deprecated. Please see ?cor_brms for details Modeling brms	2	432	October 11, 2022
Multiple autocorrelations brms	6	761	May 27, 2020
Prediction using brms-fitted model with residual correlation brms	0	463	October 31, 2022

Brms and emmeans with incl_autocor argument

Related topics