# Multilevel ordinal model - large values in threshold and group-level sd estimates?

Hi,

I am modelling some ordinal data with the cumulative family, and I have some questions about interpreting the threshold and group-level sd estimates.

My data contains a scaled predictor (n = 96, mean = 0, sd = 1), an ordinal response with 4 categories, and a grouping variable (species in this case, 24 levels). See the end of the post for the data simulation code.

```r
testdf
# A tibble: 96 x 3
val species   cat
<dbl> <chr>   <int>
1 -4.02   a           1
2 -2.93   a           1
3 -2.23   a           1
4 -2.43   a           1
5  0.225  b           1
6 -0.0597 b           1
7 -0.0194 b           1
8  0.0591 b           1
9 -0.103  c           1
10 -0.545  c           1
# … with 86 more rows
```

Estimating a simple ordinal regression works fine. It’s not a great fit to the data, and there isn’t a lot of separation between some categories, but all the values are reasonable and make sense given that the predictor is scaled to N(0, 1).

```r
tm.1 <- brm(
  cat ~ val, data = testdf,
  chains = 2, cores = 2,
  control = list(adapt_delta = 0.99, max_treedepth = 15)
)
summary(tm.1)
# summary cropped for space
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.04      0.16    -1.37    -0.73 1.00     2255     2505
Intercept    -0.46      0.14    -0.72    -0.18 1.00     3618     2899
Intercept    -0.00      0.14    -0.27     0.27 1.00     4876     2415
val              0.20      0.10     0.00     0.39 1.00     3127     2489
```
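For intuition, the fitted thresholds can be converted to category probabilities by hand. This is a sketch (not from the original post) using the rounded estimates above; `plogis` is the logistic CDF, which is the inverse link for the default cumulative-logit model:

```r
# Cumulative logit: P(cat <= k | val) = plogis(threshold_k - b * val)
thr <- c(-1.04, -0.46, 0.00)  # the three Intercept (threshold) estimates from tm.1
b   <- 0.20                   # coefficient on val
val <- 0                      # evaluate at the predictor mean

p_cum <- plogis(thr - b * val)  # cumulative probabilities P(cat <= k)
p_cat <- diff(c(0, p_cum, 1))   # per-category probabilities
round(p_cat, 2)
# -> 0.26 0.13 0.11 0.50
```

Because the predictor is standardized, thresholds in roughly the -3 to 3 range cover essentially the whole probability scale, which is why the tm.1 estimates look sensible.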

When I add the species grouping variable as a group-level effect, everything changes.

```r
tm.2 <- brm(
  cat ~ val + (1 | species), data = testdf,
  chains = 2, cores = 2,
  control = list(adapt_delta = 0.99, max_treedepth = 15)
)
summary(tm.2)
# summary cropped for space
Group-Level Effects:
~species (Number of levels: 24)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)    16.61      7.47     7.53    36.54 1.00     1142     1343

Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept   -12.95      5.87   -27.55    -4.93 1.00      968     1331
Intercept    -3.31      2.84    -9.57     1.62 1.00      849     1661
Intercept     3.31      2.91    -1.40    10.26 1.00      710     1499
val              0.16      0.77    -1.32     1.72 1.00     1977     1694
```

This clearly isn’t a very informative model, and the estimates have tremendous uncertainty. However, I am confused by a few things.

1. I thought the thresholds (Intercepts) were the estimated cut points between categories on the latent scale. In model 1 these values make sense (-1.04, etc.). But the predictor is scaled, so how can an estimate be so far from zero (like between -27.6 and -4.9 for the first Intercept)? I would instead expect values between -3 and 3, even though they will clearly overlap a lot.

2. The group-level sd also seems very high (95% CI roughly 7.5 to 36.5). Again, I understand the large uncertainty, but I do not understand the absolute size of the values given that the predictor is scaled to mean 0, sd 1.

Any guidance on why these values are so high? Am I misinterpreting the intercepts/thresholds and `sd(Intercept)`? It’s easy to find example datasets where everything works, but it’s harder to find examples where things go wrong ;)

Thanks!

Data simulation
I wanted to make these values as close to my real data as possible:

```r
set.seed(11)
# Per-species means and sds for the scaled predictor (24 species)
vm <- c(-2.107, 0.140, -0.702, 0.371, -0.346, -0.539, 1.041, -0.548, 0.080, 0.316, 1.051, 0.836, -0.191, 0.735, -0.846, 0.176, 0.396, 0.500, -0.272, -0.482, -1.329, 0.227, 0.318, 0.213)
vsd <- c(1.065, 0.237, 0.693, 0.570, 0.502, 1.058, 0.916, 0.885, 0.619, 0.632, 0.787, 0.511, 0.954, 0.968, 0.285, 0.975, 0.998, 0.390, 0.500, 0.161, 1.030, 0.830, 0.400, 0.308)
d1 <- rnorm(96, mean = rep(vm, each = 4), sd = rep(vsd, each = 4))  # 4 observations per species
d2 <- rep(letters[1:24], each = 4)  # species labels
vc <- c(16, 16, 16, 48)             # observations per response category
d3 <- rep(1:4, vc)                  # ordinal response
testdf <- dplyr::bind_cols(val = d1, species = d2, cat = d3)
```

brms 2.14.4
macOS Big Sur


Hi, sorry for not getting to you earlier; your question is relevant and well written.

I think the problem is that the model parameters cannot be informed by the data, as the species don’t really have any variability in their responses. When I look at

```r
library(dplyr)
testdf %>% group_by(species) %>% summarise(min(cat), max(cat))
```

I get:

```
   species `min(cat)` `max(cat)`
* <chr>        <int>      <int>
1 a                1          1
2 b                1          1
3 c                1          1
4 d                1          1
5 e                2          2
6 f                2          2
7 g                2          2
8 h                2          2
9 i                3          3
10 j                3          3
# ... with 14 more rows
```

So this means that I can get the exact same likelihood by e.g. lowering the threshold between responses 1 and 2 while simultaneously lowering the varying intercepts for species a - d. When this happens, the joint distribution of the parameters is constrained by the data, but the marginal distributions are only constrained by your prior. Since you use the default `student_t(3, 0, 2.5)` prior for both the intercepts and the sd, the constraints are not very tight:

```r
brms::qstudent_t(c(0.025, 0.975), df = 3) * 2.5
# -7.956116  7.956116
```
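To make the non-identifiability concrete, here is a small sketch (not from the thread; `loglik_one` is a hypothetical helper) showing that shifting all thresholds and the species intercept by the same constant leaves the likelihood of an observation unchanged:

```r
# Log-likelihood of one observation under a cumulative-logit model
loglik_one <- function(cat, thr, eta) {
  p_cum <- c(plogis(thr - eta), 1)  # cumulative category probabilities
  log(diff(c(0, p_cum))[cat])       # probability of the observed category
}

thr <- c(-1.0, -0.5, 0.0)  # thresholds
eta <- 0.3                 # linear predictor, incl. the species intercept
shift <- -10               # shift thresholds and intercept together

l1 <- loglik_one(1, thr, eta)
l2 <- loglik_one(1, thr + shift, eta + shift)
all.equal(l1, l2)  # TRUE: the data cannot tell these two apart
```

Only the differences `thr - eta` enter the likelihood, so the common shift is unconstrained by the data and only the prior pins it down.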

The posterior variability in your case is a bit higher than that, and I admit I don’t completely understand why; maybe just because of the interactions with `val`? Anyway, the pairs plot IMHO shows the pathologies quite neatly:

```r
bayesplot::mcmc_pairs(
  tm.2,
  pars = c("b_val", "b_Intercept[1]", "b_Intercept[2]",
           paste0("r_species[", c("a", "f"), ",Intercept]"))
)
```

This is also why you needed such a high `adapt_delta` and `max_treedepth` to fit the model without issues.

Best of luck with your model!


Thank you @martinmodrak for the thoughtful response. This makes a lot of sense. I am going to continue this conversation at the forum post Categorical family: group-levels not present in each response category - is this ok? because the two problems presented in this post and the other one stem from the same issue: “complete separation” in group-level predictors in an ordinal/categorical model.
