Cumulative models for multiple likert items

#1

Thanks to a recent tutorial from @paul.buerkner and Matti Vuorre, I’m realizing that it should be possible to fit ordinal models in brms for an analysis of multiple questions per informant that I’m assisting with (similar to the discussion on page 19/20 of the tutorial).

In essence, participants are asked to respond to a series of questions about funding priorities for 10 budget items. So for question q, person i provides a response on a 5-point likert scale. We hypothesize that these responses vary as a function of an i-level attribute denoted x.

Furthermore, the funding priorities can be broken down into two categories: military and educational. So the dataset looks roughly like this (showing responses for 2 of the n informants).

q y x i type
1 4 1 A Military
2 5 1 A Education
3 2 1 A Military
4 3 1 A Education
5 1 1 A Military
6 5 1 A Education
7 3 1 A Military
8 5 1 A Education
9 3 1 A Military
10 5 1 A Education
1 2 0 B Military
2 5 0 B Education
3 5 0 B Military
4 2 0 B Education
5 3 0 B Military
6 1 0 B Education
7 4 0 B Military
8 5 0 B Education
9 3 0 B Military
10 2 0 B Education

Now imagine that the researcher has two questions:

  1. The attribute, x, predicts the emphasis that the individual gives to the different types of funding.
  2. Relatedly, within individuals, people who rate educational items more highly will rate military items lower even after accounting for variation explained by x.

The first question seems like it can be profitably investigated with a brm model:
brm ( y ~ x * type + (1|i) + (1|q), family = cumulative("probit")

To examine the second question, my intuition is to consider a random coefficients model:
brm ( y ~ x * type + (1 + type|i) + (1|q), family = cumulative("probit")
And then look at the covariance of the i-level intercept and coefficients for insight into the correlation across those two types of questions.

Are there other modeling strategies that I should be considering, though?

1 Like

#2

Based on your explanations, the models you are considering look reasonable to me.

1 Like

#3

So multiple items works great using the multi-level method! But, is it possible to go multivariate too? 1. To estimate correlations between scales; and 2. for different sized scales?

This is approaching CFA/SEM (which I understand is in the brms pipeline) - but I’ve had a crack and can estimate two equal-sized scales and get the same/similar parameter estimates as the individual models - e.g :

ID ITEM B1 B2 FACTOR1 FACTOR2
1 1 0 1 7 2
1 2 0 1 3 5
1 3 0 1 6 7
1 4 0 1 3 4
2 1 1 0 6 2
2 2 1 0 2 5
2 3 1 0 4 4
2 4 1 0 3 2
modMV <- brm(mvbind(FACTOR1, FACTOR2) ~ 1 + B1*B2 + (1 | ID) + (1 | ITEM),
 data = dat, family = cumulative("probit")

The complication will be including correlations (rescor defaults to FALSE and breaks when set to TRUE), and imputing missing data if the scales are different sizes (mi() doesn’t support cumulative family). Is this theoretically possible by tweaking the Stan code?

Thanks!
Oli

0 Likes

#4

I am sorry, I don’t fully understand your question. Could you perhaps try to explain it once more?

0 Likes

#5

Apologies Paul, and thanks for asking for clarification.

I was wondering whether it is (or will be possible) to run multivariate ordinal analyses.

  1. Say for instance I have three factors, each with 5 likert items. Is it possible to estimate each factor, and the correlations between the factors?

This is possible for gaussian models in brms using MVBind and rescor = TRUE, but not for cumulative probit models. I was wondering whether this can be done by altering the stan code.

  1. If it is, then is it possible when I have three factors with 5,7, and 9 items? Again, I think this is possible using mi() for gaussian but not cumulative probit models.

I hope that makes sense!

Oli

0 Likes

#6
  1. Estimating residual correlations for ordinal models requires the CDF of the multivariate normal (of multivariate logistic) distribution, which is currently not available in Stan. As soon as we have that, I will try to make multivariate ordinal models with residual correlations available in brms. You may still model varying effects as correlated across univariate models.

  2. Not sure what mi() has to do with different number of categories, but in general, modeling ordinal responses with different number of categories (is that what you mean by “items”?) within the same multivariate model poses no problem as each response just gets its own set of thresholds.

0 Likes

#7

Excellent! Thanks for letting me know!

As for #2 - by items I mean the observations that make up the factor.

One factor might be made up for 8 individual test items (each with 7 likert categories), whereas another factor might have 5 items (with the same number of likert categories).

When this is structured in long format (one column for each factor), there are 3 NA’s in F2 for each participant. These could be treated as missing data under metric models, but not ordinal ones.

Best wishes,

Oli

0 Likes

#8

Yeah, I see the problem. For that purpose you could use the subset addition argument of brms.
See ?resp_subset.

0 Likes

#9

Awesome! Thanks! I’ll give that a shot. It means I can now just run one model for my measure, rather than four!

0 Likes