Variable names with colon problematic for spread_draws

Hi all,
I have a problem using spread_draws() from tidybayes with multilevel models fitted with brms. I get variable names like r_Studienstand:ID__theta[0_1,Intercept], and when I call model_multilvl %>% spread_draws(r_Studienstand:ID__theta[groupAndId]) I get the following error:

Error in r_Studienstand:ID__theta[groupAndId] : NA/NaN argument
In addition: Warning messages:
1: In r_Studienstand:ID__theta[groupAndId] :
  numerical expression has 3 elements: only the first used
2: In r_Studienstand:ID__theta[groupAndId] :
  numerical expression has 3 elements: only the first used

I’m pretty sure the : is the problem, because variables without colons work fine and : is R’s sequence operator (as in 1:3), so the name gets parsed as a range. I also tried model_multilvl %>% spread_draws('r_Studienstand:ID__theta'[groupAndId]) and got:

Error in spec[[2]] : subscript out of bounds

and model_multilvl %>% spread_draws(.$'r_Studienstand:ID__theta'[groupAndId]) yields:

Error in spread_draws_long_(draws, variable_names, dimension_names, regex = regex, : No variables found matching spec: c()[groupAndId]

Is there a way to specify the spread_draws call to work with the variable names or to alter the names in the fitobject?

  • Operating System: Win 10 x64
  • brms Version: 2.14.4

Okay folks. If you fitted a multilevel random-effects model in brms and wonder how to get aggregated scores (coef() seems to combine the fixed-effect term with only one random effect at a time, not with all the random effects), you can do it manually as explained below. (If you know a better way to do this, please share your knowledge, 'cause I find it kind of cumbersome!) But first, here is a sample formula for a one-parameter multilevel IRT model (which I’m referring to in the variable names above):

formula_1pl_multilevel <- bf(
  response ~ beta + theta,
  nl = TRUE,
  theta ~ 0 + (1 | groupname/ID),
  beta ~ 1 + (1 | item),
  family = brmsfamily("bernoulli", link = "logit")
)
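
For readers who prefer to see the formula written out: as far as I can tell, it corresponds to the model below. The symbols are my own notation, not brms output: p is a person (ID), g(p) the group that person belongs to, i an item, b_0 the fixed intercept of beta, and the u terms the random intercepts.

```latex
\operatorname{logit} P(y_{pi} = 1) = \beta_i + \theta_p, \qquad
\beta_i = b_0 + u_i^{\text{item}}, \qquad
\theta_p = u_{g(p)}^{\text{group}} + u_{g(p):p}^{\text{group:ID}}
```

Since theta has no fixed intercept (theta ~ 0 + ...), a person's full theta is just the sum of the group-level effect and the ID-within-group effect, which is what the steps below add up draw by draw.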
  1. Extract the values for the upper level:
grouplevel_thetas <- spread_draws(model_multilvl, r_groupname__theta[group]) %>%
	rename(theta_grouplevel = r_groupname__theta)
  2. Extract the values for the lower level:

Because your variable will look like: r_groupname:ID__theta you can’t just call spread_draws(model_multilvl, r_groupname:ID__theta[groupAndID]). Instead you have to use something like this (see here):

var_name <- 'r_groupname:ID__theta'
group_specific_thetas <- spread_draws(model_multilvl, (!!sym(var_name))[groupAndID])
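  3. Split the group and ID column:
group_specific_thetas <- group_specific_thetas %>%
	# "0_1" becomes group = 0, ID = 1; convert = TRUE casts both to integers
	separate(groupAndID, into = c('group', 'ID'), convert = TRUE) %>%
	# backticks are needed here, otherwise the colon is read as a column range
	rename(theta = `r_groupname:ID__theta`)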

You can surely combine the two steps into one with a regex, but I’m not well-versed in regex and couldn’t figure out how. I set convert = TRUE to cast the group column to integers because I used integers to name the groups. I could have cast the group column in grouplevel_thetas instead. If you use characters to identify groups, you don’t have to convert types at all. ;)

  4. And finally I joined the datasets:
result <- group_specific_thetas %>%
	left_join(grouplevel_thetas, by = c('.draw', 'group')) %>%
	mutate(theta_combined = theta + theta_grouplevel)
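
To check the split-and-join logic without a fitted model, here is a minimal sketch on made-up data. The two tibbles are hypothetical stand-ins for what spread_draws() returns (the real output also carries .chain and .iteration columns), and I pass sep = "_" explicitly, which is an assumption about how the group and ID parts are joined in the index.

```r
library(dplyr)
library(tidyr)

# Hypothetical group-level draws (stand-in for grouplevel_thetas).
group_draws <- tibble(
  .draw = c(1L, 2L, 1L, 2L),
  group = c(0L, 0L, 1L, 1L),
  theta_grouplevel = c(0.10, 0.20, -0.30, -0.10)
)

# Hypothetical ID-within-group draws (stand-in for group_specific_thetas).
person_draws <- tibble(
  .draw = c(1L, 2L, 1L, 2L),
  groupAndID = c("0_1", "0_1", "1_7", "1_7"),
  `r_groupname:ID__theta` = c(0.5, 0.4, -0.2, 0.0)
)

result <- person_draws %>%
  # Split "0_1" into group = 0 and ID = 1, converting both to integers.
  separate(groupAndID, into = c("group", "ID"), sep = "_", convert = TRUE) %>%
  # Backticks make the colon part of the column name.
  rename(theta = `r_groupname:ID__theta`) %>%
  # Match each person's draw with the draw of their group.
  left_join(group_draws, by = c(".draw", "group")) %>%
  mutate(theta_combined = theta + theta_grouplevel)

print(result)
```

Joining on .draw as well as group matters: the sum is only meaningful when both effects come from the same posterior draw.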

I think the trick to get around the colon in the name is backticks: https://stackoverflow.com/questions/36220823/what-do-backticks-do-in-r
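
A small base-R sketch of what the backticks change, using only quote() to inspect how R parses each expression (nothing here touches tidybayes, and I have not verified that spread_draws() accepts the backticked form directly, though I would expect it to):

```r
# Without backticks, the top-level call is the sequence operator `:`.
unquoted <- quote(r_groupname:ID__theta[groupAndID])
top_call_unquoted <- as.character(unquoted[[1]])

# With backticks, the whole name is one symbol, and the top-level call is `[`.
quoted <- quote(`r_groupname:ID__theta`[groupAndID])
top_call_quoted <- as.character(quoted[[1]])
indexed_name <- as.character(quoted[[2]])

# Building the same expression from a string, similar in spirit to
# (!!sym(var_name))[groupAndID] but in base R.
var_name <- "r_groupname:ID__theta"
built <- bquote(.(as.name(var_name))[groupAndID])

print(top_call_unquoted)
print(top_call_quoted)
print(indexed_name)
print(identical(built, quoted))
```

The unquoted version is why the original error complains about a numerical expression: R tries to build a sequence from r_groupname to ID__theta[groupAndID].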

It looks like you’re trying to programmatically get at the indices [0_1,Intercept]. My first reaction is to try to parse these things out with a pivot_longer regex from a dataframe: https://tidyr.tidyverse.org/reference/pivot_longer.html (so do a starts_with("r_groupname:ID__theta") to get the columns you want; note that starts_with() takes the prefix as a quoted string)

Annoying, but then you don’t have to deal with !!, sym() and stuff in your code. I’m not sure that is the best way either, though.
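
A rough sketch of that pivot_longer idea, assuming you already have the posterior draws as a wide data frame with one column per parameter (I believe posterior_samples() gives you that in brms 2.14 and as_draws_df() in newer versions, but please check). The wide_draws tibble below is made-up data, and the regex assumes the index always looks like [group_ID,Intercept] with integer group and ID.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide draws: one row per draw, one column per parameter.
wide_draws <- tibble(
  .draw = 1:2,
  `r_groupname:ID__theta[0_1,Intercept]` = c(0.5, 0.4),
  `r_groupname:ID__theta[1_7,Intercept]` = c(-0.2, 0.0),
  b_beta_Intercept = c(1.1, 1.2)
)

long_draws <- wide_draws %>%
  pivot_longer(
    # Select the columns by prefix; the colon is harmless inside a string.
    cols = starts_with("r_groupname:ID__theta"),
    # The two capture groups become the new group and ID columns.
    names_to = c("group", "ID"),
    names_pattern = "^r_groupname:ID__theta\\[([0-9]+)_([0-9]+),Intercept\\]$",
    names_transform = list(group = as.integer, ID = as.integer),
    values_to = "theta"
  )

print(long_draws)
```

This does the extraction and the group/ID split in one step, at the cost of a regex that has to match your variable names exactly.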
