Likert scale data

I’ve started playing around with brms and Likert scale data and I’ve stumbled upon something that I need someone with a bigger brain to sanity check for me. I’ve read @paul.buerkner’s tutorial paper (which I can highly recommend) but I couldn’t find a solution to my problem.

If I understand things correctly I should model Likert scale (e.g., 1…5) data as:

Response | resp_cat(5) ~ ...
(see what I did with the … @paul.buerkner ;)

But what happens if, among all subjects, no one answered, e.g., 3, and all answers are either 1-2 or 4-5? Or for that matter (a bit more common) no one has provided the answer 1 or 5? We’re not talking about missing data here, i.e., the category is absent from the data, but legitimately so…

resp_cat() requires a number, i.e., the number of categories, but does it figure out by itself that, e.g., 3 is missing, or does that not matter? Intuitively, I feel that it’s a big difference if nobody has answered 1 or 3, and that it affects(?) the cumulative prob.

In formula-helpers.R the code for resp_cat() only checks things like, e.g., is.numeric(x), while I would expect that I should be forced to list each category explicitly, i.e., c(seq(1,5)).

Any insight would be nice! Thanks :)

If your response is integer valued (instead of an ordered factor), the values are passed as is. Example:

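library(brms)
df <- data.frame(y = c(2, 3, 4))  # only 2, 3 and 4 are observed
sdata <- make_standata(y | cat(5) ~ 1, data = df, family = cumulative())
str(sdata)  # inspect the data that would be passed to Stan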
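
To make the "passed as is" point a bit more concrete: with cat(5) the cumulative model still has five categories and therefore four thresholds, whether or not every category shows up in the data. Below is a minimal base-R sketch of how category probabilities follow from thresholds under a logit link. The threshold values (tau) and the linear predictor (eta) are made up for illustration and are not brms output.

```r
# Hypothetical thresholds for a 5-category cumulative logit model
# (illustrative numbers only, not estimates from a fitted model).
tau <- c(-2, -0.5, 0.5, 2)
eta <- 0  # hypothetical linear predictor

# Cumulative probabilities P(Y <= k) for k = 1..4, then P(Y <= 5) = 1
cum_p <- c(plogis(tau - eta), 1)

# Category probabilities are successive differences of the cumulative ones
p <- diff(c(0, cum_p))
names(p) <- 1:5
p

# If thresholds 2 and 3 nearly coincide, category 3 gets almost no mass,
# which is roughly what a model would be pushed towards when nobody answers 3.
tau_close <- c(-2, 0, 0.01, 2)
p_close <- diff(c(0, plogis(tau_close - eta), 1))
names(p_close) <- 1:5
p_close
```

So an unobserved category is still part of the model; it just ends up with a small estimated probability rather than being removed.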

What is the difference between using y | cat(X) and using as.ordered(y)? I also have Likert-type data for outcome variables and used as.ordered before. Is this wrong?

Both work the same way as long as all possible categories appear in the data.
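
The caveat matters because as.ordered() builds its levels from the observed values only, so an unobserved category silently disappears from the factor. A small base-R sketch with made-up responses where nobody answered 3:

```r
# Made-up Likert responses on a 1..5 scale; category 3 is never chosen
y <- c(1, 2, 2, 4, 5, 5)

# as.ordered() only knows about the values it sees
y_obs <- as.ordered(y)
levels(y_obs)    # "1" "2" "4" "5"
nlevels(y_obs)   # 4

# Declaring the levels explicitly keeps the empty category
y_full <- factor(y, levels = 1:5, ordered = TRUE)
levels(y_full)   # "1" "2" "3" "4" "5"
nlevels(y_full)  # 5
table(y_full)    # shows a count of 0 for category 3
```

Whether an empty factor level survives all the way into the fitted model is worth checking on your brms version. If a category is unobserved, passing the integer response together with cat(5), as in the example above, states the number of categories explicitly.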

Great - that’s a relief. Thanks!