I am modelling what is, in my opinion, a complicated data set. I have data for 28 participants who rated “arousal” and “valence” using a pictographic (9-point) scale for each construct under 7 conditions: auditory only, audio-visual, low-pass filtered auditory only (half of the subjects (n = 14)), low-pass filtered auditory-visual (n = 14) high-pass filtered auditory only (half of the subjects (n = 14)), high-pass filtered auditory-visual (n = 14), and visual only (filtering not applicable). The stimuli consist of 65 “tokens” that have been categorized into “unpleasant”, “neutral”, “pleasant” - the number of tokens in each category is unbalanced.
So, it is a nested design, with half of the participants split on one sub-condition, and a bivariate response
I am trying to model the data using Dr. Buerkner’s fantastic multivariate tutorial, but I am unsure if I’m approaching the model correctly.
I have uploaded the data here:
emo_modality_with_all_tokens.csv (609.3 KB)
I am preparing the data as follows, with a bit of a cheat. I am combining ordinal ratings from 1-4 as "low, 5 as “neutral”, and 6-9 as “high”. I have some idea of how to do this in combination in Stan as a “Generated Quantity” by summing the probabilities, but I am not sure if that’s possible in brms
, is it possible to pass custom generated quantities through brms
(i.e., sum the probabilities)?
I instead re-coded the data before analyzing:
## List required packages ====
Pkgs <- c("tidyverse", "magrittr", "brms",
"rstan", "parallel", "tidybayes",
"modelr")
### Load packages ====
lapply(Pkgs, require, c = T)
## Set computing options ====
ncores = detectCores()
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
## Read in and wrangle data
Modality_Ord <- read.csv("emo_modality_with_all_tokens.csv", stringsAsFactors = FALSE) %>%
select("Subject" = ss_id, "Group" = filtering_group, "Filter" = filter_type,
"Condition" = condition, "Category" = category_label,
"Token" = token, "Valence" = valence, "Arousal" = arousal) %>%
mutate( Category = recode_factor(Category,
unpleasant = "Unpleasant",
neutral = "Neutral",
pleasant = "Pleasant")) %>%
droplevels() %>%
mutate(Condition = ifelse(Condition == "fAO" | Condition == "fAV",
paste(Condition, Filter, sep = "_"), Condition),
Subject = factor(Subject),
Condition = factor(Condition,
levels = c("AO", "fAO_LP", "fAO_HP",
"AV", "fAV_LP", "fAV_HP",
"VO")),
Arousal2 = case_when(
Arousal <= 4 ~ "Low",
Arousal == 5 ~ "Neutral",
Arousal >= 6 ~ "High"),
Valence2 = case_when(
Valence <= 4 ~ "Low",
Valence == 5 ~ "Neutral",
Valence >= 6 ~ "High")
) %>%
mutate(Arousal2 = factor(Arousal2,
levels = c("Low", "Neutral", "High"), ordered = TRUE),
Valence2 = factor(Valence2,
levels = c("Low", "Neutral", "High"), ordered = TRUE),
Token = factor(Token)) %>%
select(-Group, -Filter, -Valence, -Arousal) %>%
ungroup()
Fitted with:
# Model formula
Modality_ord_fmla3 <- bf(mvbind(Arousal2, Valence2) ~ Category +
Condition +
Category:Condition +
(1|p|Token) +
(1|q|Category:Token) +
(1|r|Subject) +
(1|s|Condition:Subject))
# Priors
modality_ord_priors_short3 <- c(
set_prior("normal(0,2)",
class = "b"),
set_prior("normal(-2,0.5)",
class = "Intercept",
coef = "1",
resp = "Arousal2"),
set_prior("normal(-1,0.5)",
class = "Intercept",
coef = "2",
resp = "Arousal2"),
set_prior("normal(-2,0.5)",
class = "Intercept",
coef = "1",
resp = "Valence2"),
set_prior("normal(-1,0.5)",
class = "Intercept",
coef = "2",
resp = "Valence2"),
set_prior("normal(0,0.125)",
class = "sd",
coef = "Intercept",
group = "Subject",
resp = "Arousal2"),
set_prior("normal(0,0.125)",
class = "sd",
coef = "Intercept",
group = "Condition:Subject",
resp = "Arousal2"),
set_prior("normal(0,0.125)",
class = "sd",
coef = "Intercept",
group = "Token",
resp = "Arousal2"),
set_prior("normal(0,0.125)",
class = "sd",
coef = "Intercept",
group = "Category:Token",
resp = "Arousal2"),
set_prior("normal(0,0.125)",
class = "sd",
coef = "Intercept",
group = "Subject",
resp = "Valence2"),
set_prior("normal(0,0.125)",
class = "sd",
coef = "Intercept",
group = "Condition:Subject",
resp = "Valence2"),
set_prior("normal(0,0.125)",
class = "sd",
coef = "Intercept",
group = "Token",
resp = "Valence2"),
set_prior("normal(0,0.125)",
class = "sd",
coef = "Intercept",
group = "Category:Token",
resp = "Valence2")
)
# Compile model
Modality_BRMS_Ord_MV_Mod_lite <-
brm(Modality_ord_fmla3,
data = Modality_Ord,
family = cumulative("logit"),
prior = modality_ord_priors_short3,
inits = 0,
iter = 5000,
warmup = 2500,
chains = 4,
cores = ncores,
sample_prior = TRUE,
control = list(adapt_delta = 0.99,
max_treedepth = 12)
)
But, when I try to add predicted draws to the data:
Modality_Ord %>%
select(Subject, Category, Condition, Token) %>%
distinct() %>%
add_fitted_draws(
Modality_BRMS_Ord_MV_Mod_lite,
n = 5,
allow_new_levels = TRUE)
My RAM (32gb) fills and then my swap (32gb) also fills and my R session crashes.
Is this just a hardware issue (not enough memory) or am I inefficiently trying to get predictions?