Likert scale data


I’ve started playing around with brms and Likert scale data and I’ve stumbled upon something that I need someone with a bigger brain to sanity check for me. I’ve read @paul.buerkner’s tutorial paper (which I can highly recommend) but I couldn’t find a solution to my problem.

If I understand things correctly I should model Likert scale (e.g., 1…5) data as:

Response | resp_cat(5) ~ ...
(see what I did with the … @paul.buerkner ;)

But what happens if, among all subjects, no one answered, e.g., 3, and all answers are either 1-2 or 4-5? Or for that matter (a bit more common) no one has provided the answer 1 or 5? We’re not talking about missing data here, i.e., it’s missing but it should be missing…

resp_cat() requires a number, i.e., the number of categories, but does it figure out by itself that, e.g., 3 is missing, or does that not matter? Intuitively, I feel that it’s a big difference if nobody has answered 1 or 3, and that it affects(?) the cumulative prob.
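To make the intuition about the cumulative probabilities concrete, here is a sketch of the standard cumulative model for five categories. The symbols are assumptions for illustration and do not come from the brms source: $F$ is the link's distribution function (e.g., logistic), $\eta$ is the linear predictor, and $\tau_1 < \dots < \tau_4$ are the thresholds.

```latex
\begin{align}
P(Y \le k) &= F(\tau_k - \eta), \qquad k = 1, \dots, 4, \\
P(Y = k)   &= F(\tau_k - \eta) - F(\tau_{k-1} - \eta),
              \qquad \tau_0 = -\infty,\ \tau_5 = +\infty .
\end{align}
```

Under this parameterisation, an empty middle category such as 3 means the data give no reason to keep $\tau_2$ and $\tau_3$ apart, so $P(Y = 3)$ is pulled toward zero. An empty end category such as 1 instead pushes $\tau_1$ toward $-\infty$, with only the prior on the thresholds holding it back. So which category is empty plausibly does matter, but only if the model is told there are five categories in the first place.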

In formula-helpers.R the code for resp_cat() only checks things like, e.g., is.numeric(x), while I would expect that I should be forced to list each category explicitly, i.e., c(seq(1,5)).

Any insight would be nice! Thanks :)


If your response is integer valued (instead of an ordered factor), the values are passed as is. Example:

library(brms)
df <- data.frame(y = c(2, 3, 4))
# y is integer valued, so it is passed as is; cat(5) declares five categories
make_standata(y | cat(5) ~ 1, data = df, family = cumulative())
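
The integer-versus-factor distinction can be illustrated with a minimal base-R sketch. It does not call brms at all; it only shows how R itself codes the values, and the variable names are made up for illustration.

```r
# Observed responses on a 1..5 scale where nobody answered 1 or 5
y <- c(2, 3, 4)

# Kept as integers, the original category labels survive
y_int <- as.integer(y)                             # 2 3 4

# A factor built without explicit levels only knows the observed values,
# so its integer codes are renumbered starting from 1
y_fac <- factor(y, ordered = TRUE)
nlevels(y_fac)                                     # 3
as.integer(y_fac)                                  # 1 2 3

# Listing all levels explicitly keeps the empty categories in place
y_full <- factor(y, levels = 1:5, ordered = TRUE)
nlevels(y_full)                                    # 5
as.integer(y_full)                                 # 2 3 4
```

How brms treats empty factor levels is not something this sketch shows; inspecting the make_standata() output for your own data is the safest way to check.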