Ordinal probit multilevel model with random effects

Hi,

I’m neither a statistician nor a full-time researcher. I’m trying to specify a model that would help answer whether two categorical predictors (C and T, with 5 and 3 levels respectively) and/or their interaction can predict a given rating on a scale from 1 to 7 (outcome Y).

I’m planning an experiment with repeated measures over T but not over C; C will be a between-subjects factor.

I would like to learn from the data whether a given level of C and T affects the variance of the outcome variable, and also to what extent participants agree on the outcome given a level of C and T.

I’ve read Solomon Kurz’s brilliant notes on the cumulative probit and the great tutorial by Bürkner and Vuorre.

However, I’m stuck because the model that I believe would answer these questions results in bad Pareto k diagnostic values. I believe the effects of the two categorical predictors are not the same across all participants, so I would like to account for random effects of all predictors by participant (id).

Below is my attempt at a relevant model.

priors <- c(prior(normal(-1.07, 1), class = Intercept, coef = 1),
            prior(normal(-0.57, 1), class = Intercept, coef = 2),
            prior(normal(-0.18, 1), class = Intercept, coef = 3),
            prior(normal( 0.18, 1), class = Intercept, coef = 4),
            prior(normal( 0.57, 1), class = Intercept, coef = 5),
            prior(normal( 1.07, 1), class = Intercept, coef = 6),
            prior(student_t(3, 0, 12), class = b),
            prior(lkj_corr_cholesky(2), class = cor),
            prior(exponential(1), class = sd))

fitORML2Interaction <- brm(
  formula = Y ~ 1 + C + T + C*T + (C + T + C*T | id) + (1 | Item), 
  data = result, 
  family = cumulative("probit"),
  prior = priors,
  chains = 4, cores = nCores,
  seed = 5476,
  init_r = 0.2,
  control = list(adapt_delta = .99),
  file = 'fit_ORML2Interaction'
)
This results in the following warnings:
# 1: The global prior 'student_t(3, 0, 2.5)' of class 'Intercept' will not be used in the model as all related coefficients have individual priors already. If you did not set those priors yourself, then maybe brms has assigned default priors. See ?set_prior and ?default_prior for more details. 
# 2: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#bulk-ess 
# 3: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
# Running the chains for more iterations may help. See
# https://mc-stan.org/misc/warnings.html#tail-ess 

And

loo(fitORML2Interaction)
# Computed from 4000 by 1800 log-likelihood matrix.
# 
# Estimate   SE
# elpd_loo  -1990.4 20.5
# p_loo       911.5 15.3
# looic      3980.9 41.0
# ------
#   MCSE of elpd_loo is NA.
# MCSE and ESS estimates assume MCMC draws (r_eff in [0.5, 1.4]).
# 
# Pareto k diagnostic values:
#                          Count  Pct.   Min. ESS
# (-Inf, 0.7]  (good)      825    45.8%  115
#  (0.7, 1]    (bad)       857    47.6%  <NA>
#  (1, Inf)    (very bad)  118     6.6%  <NA>

Below is the code for data simulation


# Simulate ratings with lfast() (from the LikertMakeR package).
# The five C levels differ in mean; within each level, the three T
# conditions (TA, TB, TC) share that mean but differ in SD (0.5, 1, 1.5).
means <- c(6, 5.5, 5.0, 4.0, 3.0)
for (i in seq_along(means)) {
  assign(paste0("C", i), data.frame(
    TA = lfast(n = 120, mean = means[i], sd = 0.5, lowerbound = 1, upperbound = 7, items = 3),
    TB = lfast(n = 120, mean = means[i], sd = 1.0, lowerbound = 1, upperbound = 7, items = 3),
    TC = lfast(n = 120, mean = means[i], sd = 1.5, lowerbound = 1, upperbound = 7, items = 3)
  ))
}

#Convert each frame to long format
CI <- gather(C1, key = "T", value = "Y", c("TA", "TB", "TC" ))
CJ <- gather(C2, key = "T", value = "Y", c("TA", "TB", "TC" ))
CK <- gather(C3, key = "T", value = "Y", c("TA", "TB", "TC" ))
CL <- gather(C4, key = "T", value = "Y", c("TA", "TB", "TC" ))
CM <- gather(C5, key = "T", value = "Y", c("TA", "TB", "TC" ))

#Remove the short frames from the environment
rm(C1, C2, C3, C4, C5)

#Create a factor for C category in each dataframe
for (nm in c("CI", "CJ", "CK", "CL", "CM")) {
  assign(nm, cbind(C = nm, get(nm)))
}


#Create subj id (1–360, stacked to match the long-format rows)
idL <- data.frame(id = 1:360)

#Insert subject id
CI <- cbind(idL, CI)
CJ <- cbind(idL, CJ)
CK <- cbind(idL, CK)
CL <- cbind(idL, CL)
CM <- cbind(idL, CM)

rm(id1, id2, id3, id, idL)

#Creating item: six items per C group (1–6 for CI, 7–12 for CJ, and so on)
groups <- c("CI", "CJ", "CK", "CL", "CM")
for (g in seq_along(groups)) {
  Item <- rep((g - 1) * 6 + 1:6, each = 60)
  assign(groups[g], cbind(Item, get(groups[g])))
}
rm(groups, Item)

#Create one dataframe
result <- rbind(CI, CJ, CK, CL, CM)

rm(CI, CJ, CK, CL, CM)
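As a quick sanity check on the assembled data (the counts follow from the construction above, so this snippet only works after running the simulation code):

```r
# Each of the 5 C levels contributes 360 long-format rows (120 subjects x 3 T levels)
nrow(result)               # 1800, matching the 4000 by 1800 log-likelihood matrix from loo() above
table(result$C, result$T)  # 120 ratings per C x T cell
```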

  • Operating System: macOS Sonoma 14.6.1
  • brms Version: 2.21.0

Any help on this topic would be much appreciated.

Best,
Dan

Also, it would be extremely helpful to get some aid in comparing the different models, given the LOO output for model fitORML2Interaction.

I compared that model to the following two models:

Model fitORML1

formula = Y ~ 1 + C + T + (1 | id) + (1 | Item)

Model fitORML1Interaction

formula = Y ~ 1 + C + T + C*T + (1 | id) + (1 | Item)

Model fitORML2Interaction

formula = Y ~ 1 + C + T + C*T + (C + T + C*T | id) + (1 | Item)

LOO comparison:

loo_compare(fitORML2Interaction, fitORML1Interaction, fitORML1)
#                     elpd_diff se_diff
# fitORML2Interaction       0.0     0.0
# fitORML1               -725.2    27.3
# fitORML1Interaction    -731.2    28.2

In a traditional regression I would treat T as a moderator for C. Would this have an impact on which model I choose?

Would anyone be able to expand on the benefits and tradeoffs for the different models?

Any help is much appreciated.

Best,
Dan

Bump…

If C is between-subjects, it should not be included as a slope varying randomly by subject. Its presence in your model probably accounts for your low ESS and bad Pareto k values. I would suggest the following “maximal” formula:

Y ~ 1 + C + T + C * T + (1 + T | id) + (1 | Item)

or, more concisely,

Y ~ C * T + (T | id) + (1 | Item)
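In full, the suggested model could be fitted with something like the following sketch, reusing the data, priors, and settings from the original call (the object and file names here are made up):

```r
fitORML2T <- brm(
  Y ~ C * T + (T | id) + (1 | Item),
  data = result,
  family = cumulative("probit"),
  prior = priors,
  chains = 4, cores = nCores,
  seed = 5476,
  init_r = 0.2,
  control = list(adapt_delta = .99),
  file = "fit_ORML2T"
)
```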

Many thanks @andymilne!

I hope you don’t mind me also asking: what would I gain from also modelling unequal variances, as in

bf(Y ~ C * T + (T | id) + (1 | Item)) +
  lf(disc ~ 0 + C + T + C*T)

Your help is much appreciated!

Best,
D

Certainly worth trying unequal variances – you can check whether they are worth including with a LOO comparison (assuming all your Pareto k values come good). For predicting disc, I think you need to include the argument cmc = FALSE (but please check the documentation; I might be wrong on that).
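Putting that together, the distributional model might be specified along these lines (a sketch, not a definitive recipe: the object name is made up, and the placement of cmc = FALSE inside lf() should be checked against ?lf):

```r
fitORML2Disc <- brm(
  bf(Y ~ C * T + (T | id) + (1 | Item)) +
    lf(disc ~ 0 + C + T + C*T, cmc = FALSE),
  data = result,
  family = cumulative("probit"),
  chains = 4, cores = nCores,
  seed = 5476,
  control = list(adapt_delta = .99)
)
```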

Many thanks @andymilne,

So, would you mind helping me interpret the disc parameters? Here are the coefficients:

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]    -4.46      0.60    -5.69    -3.37 1.00      305      894
Intercept[2]    -3.55      0.50    -4.58    -2.64 1.00      303      986
Intercept[3]    -2.77      0.43    -3.65    -2.00 1.00      304      957
Intercept[4]    -2.10      0.37    -2.86    -1.45 1.00      318      904
Intercept[5]    -1.32      0.31    -1.96    -0.77 1.00      367      914
Intercept[6]     0.17      0.26    -0.34     0.67 1.00      740     1263
CCJ             -1.17      0.34    -1.87    -0.55 1.00      426      983
CCK             -1.79      0.37    -2.57    -1.15 1.00      356     1056
CCL             -2.69      0.44    -3.59    -1.91 1.00      321     1076
CCM             -3.62      0.53    -4.69    -2.66 1.00      310      931
TTB             -0.76      0.31    -1.39    -0.19 1.00      458     1019
TTC             -0.35      0.33    -1.02     0.28 1.00      573      917
CCJ:TTB          0.50      0.36    -0.16     1.22 1.00      573     1058
CCK:TTB          0.68      0.35     0.02     1.39 1.00      570     1298
CCL:TTB          0.75      0.34     0.11     1.45 1.00      521     1226
CCM:TTB          0.91      0.36     0.24     1.65 1.00      510     1127
CCJ:TTC          0.24      0.40    -0.52     1.06 1.00      661     1070
CCK:TTC          0.20      0.36    -0.50     0.93 1.00      711     1167
CCL:TTC          0.43      0.36    -0.25     1.17 1.00      588      975
CCM:TTC          0.54      0.37    -0.16     1.29 1.00      602     1078
disc_CCI        -0.87      0.14    -1.14    -0.59 1.00      470     1075
disc_CCJ        -0.37      0.17    -0.69    -0.02 1.02      282      760
disc_CCK        -0.27      0.14    -0.52     0.01 1.00      462     1428
disc_CCL        -0.20      0.14    -0.46     0.07 1.00      531     1387
disc_CCM        -0.28      0.14    -0.55     0.00 1.00      515     1559
disc_TTB         0.65      0.10     0.45     0.84 1.00     1184     2006
disc_TTC         0.27      0.10     0.07     0.47 1.00     1522     2362
disc_CCJ:TTB     0.13      0.25    -0.27     0.70 1.04      190      437
disc_CCK:TTB    -0.17      0.14    -0.44     0.10 1.00     1679     2271
disc_CCL:TTB    -0.17      0.13    -0.43     0.10 1.00     1365     2156
disc_CCM:TTB    -0.10      0.14    -0.37     0.17 1.00     1992     2745
disc_CCJ:TTC    -0.23      0.15    -0.52     0.06 1.00     2314     2856
disc_CCK:TTC     1.10      0.16     0.81     1.42 1.00     1979     2524
disc_CCL:TTC     1.10      0.16     0.79     1.43 1.00     1575     2580
disc_CCM:TTC     0.96      0.16     0.65     1.28 1.00     1368     1947

How would I interpret the disc parameters?

Also, would I be correct to interpret the estimate for each level of the two factors as an effect size, i.e., movement on the standardised latent scale underlying the ordinal values?

Thanks!

These days I generally avoid interpreting the meanings of parameters directly (it is too easy to make mistakes with complex models) and instead make substantive inferences about effect sizes, and the evidence for them, from predictions. This is made easy in packages such as emmeans and marginaleffects. In the latter, which I know best, you can easily specify any set of conditions of interest, or contrast of conditions of interest, and see the mean (or median) of the predicted effect and its uncertainty. If these predictions are made on the latent (linear) scale (what marginaleffects calls the “link” scale) and you are using a probit link, then these are indeed standardized effect sizes on the latent scale.
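For example, with marginaleffects something like the following could give cell means and contrasts on the latent scale (hypothetical calls against the fitted model from earlier in the thread; check the package documentation for the exact argument names):

```r
library(marginaleffects)

# Posterior predictions on the latent ("link") scale for each C x T cell
predictions(fitORML2Interaction,
            newdata = datagrid(C = unique, T = unique),
            type = "link")

# Contrasts between C levels, averaged over the data, on the same scale
avg_comparisons(fitORML2Interaction, variables = "C", type = "link")
```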

Thanks @andymilne!

I’ll have a look at the marginal effects package.

Would anyone be able to share what the disc values actually add in interpretive value?

Best,
Dan

The standard deviation of the predicted response on the latent scale is given by 1/exp(disc). So the SD for the CCJ condition, when T is at its reference level, is given by 1/exp(-0.37) = 1.45; the SD for the CCJ condition when T is at level TTB is 1/exp(-0.37 + 0.13) = 1.27. Generally, for coefficients (or sums of coefficients) < 0, the SD is > 1; for coefficients (or sums of coefficients) > 0, the SD is < 1. So, the more positive the coefficient, the greater the consistency in responses. More info here: Understanding the disc parameter in ordered logit models - #9 by matti
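The arithmetic can be checked directly (latent_sd is just a helper name made up here for 1/exp of the summed disc coefficients):

```r
# SD on the latent scale implied by disc coefficients:
# SD = 1 / exp(sum of the applicable disc terms)
latent_sd <- function(...) 1 / exp(sum(...))

latent_sd(-0.37)        # disc_CCJ alone: ~1.45
latent_sd(-0.37, 0.13)  # disc_CCJ + disc_CCJ:TTB: ~1.27
```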
