Cumulative models for multiple likert items

jeremy.koster · January 14, 2019, 4:45pm

Thanks to a recent tutorial from @paul.buerkner and Matti Vuorre, I’m realizing that it should be possible to fit ordinal models in brms for an analysis of multiple questions per informant that I’m assisting with (similar to the discussion on page 19/20 of the tutorial).

In essence, participants are asked to respond to a series of questions about funding priorities for 10 budget items. So for question q, person i provides a response on a 5-point likert scale. We hypothesize that these responses vary as a function of an i-level attribute denoted x.

Furthermore, the funding priorities can be broken down into two categories: military and educational. So the dataset looks roughly like this (showing responses for 2 of the n informants).

q	y	x	i	type
1	4	1	A	Military
2	5	1	A	Education
3	2	1	A	Military
4	3	1	A	Education
5	1	1	A	Military
6	5	1	A	Education
7	3	1	A	Military
8	5	1	A	Education
9	3	1	A	Military
10	5	1	A	Education
1	2	0	B	Military
2	5	0	B	Education
3	5	0	B	Military
4	2	0	B	Education
5	3	0	B	Military
6	1	0	B	Education
7	4	0	B	Military
8	5	0	B	Education
9	3	0	B	Military
10	2	0	B	Education

Now imagine that the researcher has two questions:

The attribute, x, predicts the emphasis that the individual gives to the different types of funding.
Relatedly, within individuals, people who rate educational items more highly will rate military items lower even after accounting for variation explained by x.

The first question seems like it can be profitably investigated with a brm model:
brm ( y ~ x * type + (1|i) + (1|q), family = cumulative("probit")

To examine the second question, my intuition is to consider a random coefficients model:
brm ( y ~ x * type + (1 + type|i) + (1|q), family = cumulative("probit")
And then look at the covariance of the i-level intercept and coefficients for insight into the correlation across those two types of questions.

Are there other modeling strategies that I should be considering, though?

paul.buerkner · January 15, 2019, 8:50am

Based on your explanations, the models you are considering look reasonable to me.

Oli_J · April 9, 2019, 5:02pm

So multiple items works great using the multi-level method! But, is it possible to go multivariate too? 1. To estimate correlations between scales; and 2. for different sized scales?

This is approaching CFA/SEM (which I understand is in the brms pipeline) - but I’ve had a crack and can estimate two equal-sized scales and get the same/similar parameter estimates as the individual models - e.g :

ID	ITEM	B1	B2	FACTOR1	FACTOR2
1	1	0	1	7	2
1	2	0	1	3	5
1	3	0	1	6	7
1	4	0	1	3	4
2	1	1	0	6	2
2	2	1	0	2	5
2	3	1	0	4	4
2	4	1	0	3	2

modMV <- brm(mvbind(FACTOR1, FACTOR2) ~ 1 + B1*B2 + (1 | ID) + (1 | ITEM),
 data = dat, family = cumulative("probit")

The complication will be including correlations (rescor defaults to FALSE and breaks when set to TRUE), and imputing missing data if the scales are different sizes (mi() doesn’t support cumulative family). Is this theoretically possible by tweaking the Stan code?

Thanks!
Oli

paul.buerkner · April 15, 2019, 10:25am

I am sorry, I don’t fully understand your question. Could you perhaps try to explain it once more?

Oli_J · April 16, 2019, 9:20am

Apologies Paul, and thanks for asking for clarification.

I was wondering whether it is (or will be possible) to run multivariate ordinal analyses.

Say for instance I have three factors, each with 5 likert items. Is it possible to estimate each factor, and the correlations between the factors?

This is possible for gaussian models in brms using MVBind and rescor = TRUE, but not for cumulative probit models. I was wondering whether this can be done by altering the stan code.

If it is, then is it possible when I have three factors with 5,7, and 9 items? Again, I think this is possible using mi() for gaussian but not cumulative probit models.

I hope that makes sense!

Oli

paul.buerkner · April 16, 2019, 10:22am

Estimating residual correlations for ordinal models requires the CDF of the multivariate normal (of multivariate logistic) distribution, which is currently not available in Stan. As soon as we have that, I will try to make multivariate ordinal models with residual correlations available in brms. You may still model varying effects as correlated across univariate models.
Not sure what mi() has to do with different number of categories, but in general, modeling ordinal responses with different number of categories (is that what you mean by “items”?) within the same multivariate model poses no problem as each response just gets its own set of thresholds.

Oli_J · April 16, 2019, 11:30am

Excellent! Thanks for letting me know!

As for #2 - by items I mean the observations that make up the factor.

One factor might be made up for 8 individual test items (each with 7 likert categories), whereas another factor might have 5 items (with the same number of likert categories).

When this is structured in long format (one column for each factor), there are 3 NA’s in F2 for each participant. These could be treated as missing data under metric models, but not ordinal ones.

Best wishes,

Oli

paul.buerkner · April 16, 2019, 11:36am

Yeah, I see the problem. For that purpose you could use the subset addition argument of brms.
See ?resp_subset.

Oli_J · April 17, 2019, 7:42am

Awesome! Thanks! I’ll give that a shot. It means I can now just run one model for my measure, rather than four!

ian.macdonald · July 29, 2019, 2:03pm

@Oli_J, would you mind sharing your resultant model syntax for these multivariate ordinal models (with different item numbers for each response measure)? I have a similar modeling challenge and am unsure how to utilise the resp_subset argument. Cheers!

Oli_J · August 1, 2019, 10:14pm

Hello! I would if I could, but I couldn’t get it working :-( I’m going to be revisiting my analysis though, and if I am successful I will of course post it one here!

ian.macdonald · August 6, 2019, 8:04pm

Thanks for the update, and good luck. I’ll post here as well if I get it working.

Rui_Chaves · September 13, 2019, 2:02pm

Hi,
Can anyone illustrate how a BRMS model would be fit if the goal is simply to measure the correlation between two Likert variables? Assume they are both 1-7, for example. I tried using mo() on the dependent variable, but that resulted in an error. Only the independent variable could be passed this way.

Thank you.

ian.macdonald · September 14, 2019, 5:01pm

Perhaps this blog post may help you? The model formula used in the discussed examples is: mvbind(x, y) ~ 1, where the residual correlation between x and y is the parameter of interest. That said, the example uses continuous data, and the brms multivariate tutorial states estimating residual correlations is only possible with a “multivariate normal (or student-t) model”.

paul.buerkner · September 14, 2019, 5:55pm

This does unfortunatley not work with cumulative models as we don’t have a CDF function of the multivariate normal distribution in Stan yet, which would be relevant to use as the latent variable distribution. In other words, this is not yet possible in brms but may be possible in the future.

Rui_Chaves · September 15, 2019, 3:01pm

Thank you ian.macdonald and paul.buerkner.

Rui_Chaves · September 18, 2019, 6:37pm

Actually, the next best thing would be to fit:

brm(Likert1 ~ mo(Likert2) + (1|Var), data=f, family=cumulative(link = “logit”, threshold=“flexible”))

Can’t I take the independent variable’s log-odds coefficient to be an approximation of how the two likert variables are correlated?

paul.buerkner · September 18, 2019, 7:59pm

I dont know about the properties of such an estimate so I would be careful recommending it.

Rui_Chaves · September 21, 2019, 4:19pm

I understand. Likert1 consists of satisfaction ratings (1 = very unsatisfied, 7 = very satisfied) and Likert2 consists of recommendation ratings (1 = very unlikely to recommend product to others, 7 = very likely to recommend product to others). Var corresponds to cost, in dollars.

brm(Likert1 ~ mo(Likert2) + (1|Var), data=f, family=cumulative(link = “logit”, threshold=“flexible”))

I just want to understand this kind of model better, I guess. Any suggestions and help welcome.

paul.buerkner · September 21, 2019, 4:22pm

Thanks but that’s not what I mean.

I just dont know what is implied by using the mo prexictor with the cumulative family and what properties the related coefficient has in relation to what you want. So I would not use this for the purpose you have in mind.

Topic		Replies	Views
Ordinal likert data and model selection Modeling loo , ordinal-response , model-selection , brms	0	457	June 6, 2023
Model selection and multiple likert ordinal regression brms loo	3	572	September 6, 2023
Brms cumulative model with varying response categories brms	2	129	June 23, 2024
Help with ordinal regression model with multiple likert items brms specification	7	743	December 20, 2020
Question about family = cumulative in brms Modeling ordinal-response , priors , brms	5	1695	April 4, 2023

Cumulative models for multiple likert items

Related topics