Hi,
I’m not a statistician nor a full time researcher. I’m trying to specify a model that would be helpful in answering whether two categorical predictors (C and T, whit 5 and 3 levels respectively) and/or their interaction can predict a given rating on a scale from 1 to 7 (outcome Y).
I’m planning an experiment that will have repeated measures over T but not C. C will be a between subject factor.
I would like to learn from the data whether a given level of C and T affects the variance of the outcome variable. Also, to what extent is there agreement between the participants on the outcome given a level of C and T.
I’ve read Solom Kurtz brilliant notes on the cumulative probit and the great tutorial by Bürkner and Vuorre.
However, I’m stuck because the model that I believe would answer the questions results in bad Pareto k diagnostic values. I do believe that the effect of the two categorical predictors are not the same across all participants, hence I would like to account for random effects for all predictors by participants (id).
Below is my attempt to a relevant model given my .
priors = c(prior(normal(-1.07, 1), class = Intercept, coef = 1),
prior(normal(-0.57, 1), class = Intercept, coef = 2),
prior(normal(-0.18, 1), class = Intercept, coef = 3),
prior(normal( 0.18, 1), class = Intercept, coef = 4),
prior(normal( 0.57, 1), class = Intercept, coef = 5),
prior(normal( 1.07, 1), class = Intercept, coef = 6),
prior("student_t(3,0, 12)", class = "b"),
prior("lkj_corr_cholesky (2)", class = "cor"),
prior(exponential(1) , class = sd))
fitORML2Interaction <- brm(
formula = Y ~ 1 + C + T + C*T + (C + T + C*T | id) + (1 | Item),
data = result,
family = cumulative("probit"),
prior = priors,
chains = 4, cores = nCores,
seed = 5476,
init_r = 0.2,
control = list(adapt_delta = .99),
file = 'fit_ORML2Interaction'
)
Results in # Warning messages:
# 1: The global prior 'student_t(3, 0, 2.5)' of class 'Intercept' will not be used in the model as all related coefficients have individual priors already. If you did not set those priors yourself, then maybe brms has assigned default priors. See ?set_prior and ?default_prior for more details.
# 2: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#bulk-ess
# 3: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#tail-ess
And
loo(fitORML2Interaction)
# Computed from 4000 by 1800 log-likelihood matrix.
#
# Estimate SE
# elpd_loo -1990.4 20.5
# p_loo 911.5 15.3
# looic 3980.9 41.0
# ------
# MCSE of elpd_loo is NA.
# MCSE and ESS estimates assume MCMC draws (r_eff in [0.5, 1.4]).
#
# Pareto k diagnostic values:
# Count Pct. Min. ESS
# (-Inf, 0.7] (good) 825 45.8% 115
# (0.7, 1] (bad) 857 47.6% <NA>
# (1, Inf) (very bad) 118 6.6% <NA>
Below is the code for data simulation
TA <- lfast(
n = 120,
mean = 6,
sd = 0.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
TB <- lfast(
n = 120,
mean = 6,
sd = 1,
lowerbound = 1,
upperbound = 7,
items = 3
)
TC <- lfast(
n = 120,
mean = 6,
sd = 1.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
C1 <- data.frame(TA, TB, TC)
rm(TA, TB, TC)
TA <- lfast(
n = 120,
mean = 5.5,
sd = 0.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
TB <- lfast(
n = 120,
mean = 5.5,
sd = 1,
lowerbound = 1,
upperbound = 7,
items = 3
)
TC <- lfast(
n = 120,
mean = 5.5,
sd = 1.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
C2 <- data.frame(TA, TB, TC)
rm(TA, TB, TC)
TA <- lfast(
n = 120,
mean = 5.0,
sd = 0.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
TB <- lfast(
n = 120,
mean = 5.0,
sd = 1,
lowerbound = 1,
upperbound = 7,
items = 3
)
TC <- lfast(
n = 120,
mean = 5.0,
sd = 1.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
C3 <- data.frame(TA, TB, TC)
rm(TA, TB, TC)
TA <- lfast(
n = 120,
mean = 4.0,
sd = 0.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
TB <- lfast(
n = 120,
mean = 4.0,
sd = 1,
lowerbound = 1,
upperbound = 7,
items = 3
)
TC <- lfast(
n = 120,
mean = 4.0,
sd = 1.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
C4 <- data.frame(TA, TB, TC)
rm(TA, TB, TC)
TA <- lfast(
n = 120,
mean = 3.0,
sd = 0.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
TB <- lfast(
n = 120,
mean = 3.0,
sd = 1,
lowerbound = 1,
upperbound = 7,
items = 3
)
TC <- lfast(
n = 120,
mean = 3.0,
sd = 1.5,
lowerbound = 1,
upperbound = 7,
items = 3
)
C5 <- data.frame(TA, TB, TC)
rm(TA, TB, TC)
#Convert each fram to long format
CI <- gather(C1, key = "T", value = "Y", c("TA", "TB", "TC" ))
CJ <- gather(C2, key = "T", value = "Y", c("TA", "TB", "TC" ))
CK <- gather(C3, key = "T", value = "Y", c("TA", "TB", "TC" ))
CL <- gather(C4, key = "T", value = "Y", c("TA", "TB", "TC" ))
CM <- gather(C5, key = "T", value = "Y", c("TA", "TB", "TC" ))
#Remove the short frames from the environment
rm(C1, C2, C3, C4, C5)
#Create a factor for C category in each dataframe
C <- rep(c("CI"), len = 360)
CI <- cbind(C, CI)
rm(C)
C <- rep(c("CJ"), len = 360)
CJ <- cbind(C, CJ)
rm(C)
C <- rep(c("CK"), len = 360)
CK <- cbind(C, CK)
rm(C)
C <- rep(c("CL"), len = 360)
CL <- cbind(C, CL)
rm(C)
C <- rep(c("CM"), len = 360)
CM <- cbind(C, CM)
rm(C)
#Create subj id
id1 <- rep(1:120)
id2 <- rep(121:240)
id3 <- rep(241:360)
id <- data.frame(id1, id2, id3)
idL <- gather(id, key = "id", value = "id", (c(id1, id2, id3)))
idL <- subset(idL, select = -c(id))
#Insert subject id
CI <- cbind(idL, CI)
CJ <- cbind(idL, CJ)
CK <- cbind(idL, CK)
CL <- cbind(idL, CL)
CM <- cbind(idL, CM)
rm(id1, id2, id3, id, idL)
#Creating item
Item <- rep(1:6, each = 60, len = 360)
CI <- cbind(Item, CI)
rm(Item)
Item <- rep(7:12, each = 60, len = 360)
CJ <- cbind(Item, CJ)
rm(Item)
Item <- rep(13:18, each = 60, len = 360)
CK <- cbind(Item, CK)
rm(Item)
Item <- rep(19:24, each = 60, len = 360)
CL <- cbind(Item, CL)
rm(Item)
Item <- rep(25:30, each = 60, len = 360)
CM <- cbind(Item, CM)
rm(Item)
#Create one dataframe
result <- rbind(CI, CJ, CK, CL, CM)
rm(CI, CJ, CK, CL, CM)
- Operating System: macOS Sonoma 14.6.1
- brms Version: 2.21.0
Any help on this topic would be much appreciated.
Best,
Dan