Getting the grand mean from fitted()

Apologies in advance because I think this is likely to be user error! I’m trying to extract the grand mean from a model by using ‘NA’ within the newdata argument of fitted(), as described here in the documentation:

NA values within factors in newdata , are interpreted as if all dummy variables of this factor are zero. This allows, for instance, to make predictions of the grand mean when using sum coding.
From: https://www.rdocumentation.org/packages/brms/versions/2.10.0/topics/fitted.brmsfit

However, when I use NA within a factor, I’m getting an error:
‘Error in X %*% re$trans.U : non-conformable arguments’

My model is:

brm(
  bf(
    y ~ main + indicator_var * log_inv_logit(interaction),
    main ~ offset(log_dilution) + factor2 -1,
    interaction ~ s(x, bs = 'gp') +
                  s(x, by = factor1, bs = 'gp') +
                  s(factor1, bs = "re") +
                  (1|factor2),
    nl = T
  ),
  data = as.data.frame(data),
  family = poisson(),
  prior = priors,
  iter = 5000, warmup = 2500, chains = 4, cores = 4,
  silent = F
)

Reproducible example below:

# setting up data
factor1 <- rep(c(rep("A",6), rep("B",6), rep("C",6)), 2)

set.seed(42)
data <- data.frame(
  factor1 =  factor(as.character(factor1)),
  factor2 =  factor(as.character(c(seq(1,18)), seq(1,18))),
  indicator_var = c(rep(1, 18), rep(0, 18)),
  log_dilution = c(rep(log(1), 18), rep(log(0.1), 18)),
  x = as.double(c(seq(1,18), seq(1,18))),
  y = c(round(runif(18, min = 0, max = 100)), round(runif(18, min = 10, max = 100)))
  )

# priors 
priors <- c(
    set_prior('normal(0, 100)', class = 'b', nlpar = 'main'),
    set_prior('normal(0, 100)', class = 'b', nlpar = 'interaction')
)

# model  
spline_fit <- brm(
  bf(
    y ~ main + indicator_var * log_inv_logit(interaction),
    main ~ offset(log_dilution) + factor2 -1,
    interaction ~ s(x, bs = 'gp') +
                  s(x, by = factor1, bs = 'gp') +
                  s(factor1, bs = "re") +
                  (1|factor2),
    nl = T
  ),
  data = as.data.frame(data),
  family = poisson(),
  prior = priors,
  iter = 5000, warmup = 2500, chains = 4, cores = 4,
  silent = F
)

# making newdata, removing y and replacing factor1 with NA to get grand mean
my_newdata <- data %>%
  select(-y, -factor1) %>%
  mutate(factor1 = as.factor(NA))

#using fitted
fitted(spline_fit,
       newdata = my_newdata,
       re_formula = NULL,
       nlpar = "interaction",
       summary = T, probs = c(0.05, 0.95),
       allow_new_levels=TRUE)

Without using NA in the new data, I’m getting sensible results for the splines from each group of ‘factor1’. But what I’d also like to extract fitted values for is the “s(x, bs = ‘gp’)” spline. I get the same error with ‘predict’.

  • Operating System: Windows 10
  • brms Version: 2.11.0

Thanks in advance!

This approach only works in standard predictor terms and not in splines. The latter are handled by mgcv whose internal behavior I cannot control.

Ahh okay, that makes sense. I guess this is a question for mgcv then! Thanks very much for your quick reply.

Yes but note that this NA approach only works in brms currently and is no general R thing.

Just a quick update in case others are trying to do the same thing - I’ve somehow managed to do this using NAs in the factor levels to extract the fitted values for this “grand mean” spline.

Instead of replacing the factor with NA (which leaves you with just 1 factor level, NA) you just need to set the level of all the rows to NA but keep the original levels within the factor.

Eg:

my_newdata ← data %>%
mutate(factor1 = NA)

As opposed to:

my_newdata ← data %>%
mutate(factor1 = as.factor(NA))

1 Like