Hi,
I am working with some cumulative family ordinal data, and I have some questions on interpreting some values for thresholds and grouplevel sd.
My data contains a scaled value (n= 96, mean = 0, sd = 1), an ordinal category (n = 4), and a grouping variable (species in this case, n = 24). See end of the post for data simulation code.
testdf
# A tibble: 96 x 3
val species cat
<dbl> <chr> <int>
1 4.02 a 1
2 2.93 a 1
3 2.23 a 1
4 2.43 a 1
5 0.225 b 1
6 0.0597 b 1
7 0.0194 b 1
8 0.0591 b 1
9 0.103 c 1
10 0.545 c 1
# â€¦ with 86 more rows
Estimating a simple ordinal regression works fine. Itâ€™s not a great fit to the data, and there isnâ€™t a lot of separation between some categories, but all the values are reasonable and make sense given that the predictor is scaled to N(0,1).
tm.1 < brm(
cat ~ val, data = testdf,
family = cumulative(link = "probit"),
chains = 2, cores = 2,
control = list(adapt_delta = 0.99, max_treedepth = 15)
)
summary(tm.1)
# summary cropped for space
PopulationLevel Effects:
Estimate Est.Error l95% CI u95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1] 1.04 0.16 1.37 0.73 1.00 2255 2505
Intercept[2] 0.46 0.14 0.72 0.18 1.00 3618 2899
Intercept[3] 0.00 0.14 0.27 0.27 1.00 4876 2415
val 0.20 0.10 0.00 0.39 1.00 3127 2489
When I add in the species grouping variable as a group level effect, everything changes.
tm.2 < brm(
cat ~ val + (1  species), data = testdf,
family = cumulative(link = "probit"),
chains = 2, cores = 2,
control = list(adapt_delta = 0.99, max_treedepth = 15)
)
summary(tm.2)
# summary cropped for space
GroupLevel Effects:
~species (Number of levels: 24)
Estimate Est.Error l95% CI u95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 16.61 7.47 7.53 36.54 1.00 1142 1343
PopulationLevel Effects:
Estimate Est.Error l95% CI u95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1] 12.95 5.87 27.55 4.93 1.00 968 1331
Intercept[2] 3.31 2.84 9.57 1.62 1.00 849 1661
Intercept[3] 3.31 2.91 1.40 10.26 1.00 710 1499
val 0.16 0.77 1.32 1.72 1.00 1977 1694
This clearly isnâ€™t a very informative model, and the estimates have tremendous uncertainty. However I am confused by a few things.

I thought thresholds (Intercepts) were estimated threshold points between categories in the predictor variable. In model 1 these values make sense (1.04, etc). But the predictor variable is scaled, so how can an estimate be so far from zero (like between 22.7 and 4.9 for Intercept[1])? I would instead expect values between 3 and 3, even though they will clearly overlap a lot.

The grouplevel variance also seems very high (between 7 and 35). Again, I understand the large uncertainty, but I do not understand the absolute size of the values given that the predictor is scaled to 0.
Any guidance on why these values are so high? Am I misinterpreting the values of the intercepts/thresholds and the sd(group_level)
? Itâ€™s easy to find example datasets where everything works, but itâ€™s harder to find examples where things go wrong ;)
Thanks!
Data simulation
I wanted to make these values as close to my data as possible
seed = 11
vm < c(2.107, 0.140, 0.702, 0.371, 0.346, 0.539, 1.041, 0.548, 0.080, 0.316, 1.051, 0.836, 0.191, 0.735, 0.846, 0.176, 0.396, 0.500, 0.272, 0.482, 1.329, 0.227, 0.318, 0.213)
vsd < c(1.065, 0.237, 0.693, 0.570, 0.502, 1.058, 0.916, 0.885, 0.619, 0.632, 0.787, 0.511, 0.954, 0.968, 0.285, 0.975, 0.998, 0.390, 0.500, 0.161, 1.030, 0.830, 0.400, 0.308)
d1 < rnorm(96, mean=rep(vm,each=4), sd=rep(vsd, each = 4))
d2 < rep(letters[1:24], each = 4)
vc < c(16,16,16,48)
d3 < rep(1:4, vc)
testdf < dplyr::bind_cols(val = d1, species = d2, cat = d3)
brms 2.14.4
mac BigSur