As I am dealing with 7-point Likert data, I am trying to get my head around cumulative probit models and how they are implemented in brms.
I don’t have much prior information, so I planned to use the technique mentioned by @Solomon in
this post: Understanding odd estimates from cumulative probit model - #14 by Solomon
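As far as I understand the technique, each threshold prior is centred on the standard-normal quantile of the cumulative probability you would get if all seven categories were equally likely. A minimal sketch in base R of how I arrived at the prior means (the variable names are my own, and this is my reading of the post rather than an official recipe):

```r
# Number of response categories on the Likert scale
n_cat <- 7

# Cumulative probabilities at the six thresholds when every category
# is assumed equally likely: 1/7, 2/7, ..., 6/7
cum_p <- seq_len(n_cat - 1) / n_cat

# Probit (standard-normal quantile) of each cumulative probability;
# these become the means of the normal priors on the thresholds
tau_mean <- qnorm(cum_p)

round(tau_mean, 3)
```

Rounded to three decimals, these are the means I use in the `normal(..., 1)` priors on the six intercepts.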
However, my actual data are quite skewed and do not contain the highest-ranked category.
This means that brms will not allow me to specify a prior for the sixth intercept, \tau_6.
Running the code:
validate_prior(
  prior = c(
    prior(normal(-1.068, 1), class = Intercept, coef = 1),
    prior(normal(-0.566, 1), class = Intercept, coef = 2),
    prior(normal(-0.18, 1), class = Intercept, coef = 3),
    prior(normal( 0.18, 1), class = Intercept, coef = 4),
    prior(normal( 0.566, 1), class = Intercept, coef = 5),
    prior(normal( 1.068, 1), class = Intercept, coef = 6),
    prior(exponential(1), class = sd)
  ),
  data = data_without_cat_7,
  family = cumulative(probit),
  formula = response ~ 1 + (1 | id) + (1 | name)
)
yields the error message:
Error: The following priors do not correspond to any model parameter: Intercept_6 ~ normal(1.068, 1)
I can of course introduce synthetic data to work around the issue, e.g.:
d <- rbind(
  data_without_cat_7,
  data.frame(response = c(7), id = c("a"), name = c("extra"))
)
validate_prior(
  prior = c(
    prior(normal(-1.068, 1), class = Intercept, coef = 1),
    prior(normal(-0.566, 1), class = Intercept, coef = 2),
    prior(normal(-0.18, 1), class = Intercept, coef = 3),
    prior(normal( 0.18, 1), class = Intercept, coef = 4),
    prior(normal( 0.566, 1), class = Intercept, coef = 5),
    prior(normal( 1.068, 1), class = Intercept, coef = 6),
    prior(exponential(1), class = sd)
  ),
  data = d,
  family = cumulative(probit),
  formula = response ~ 1 + (1 | id) + (1 | name)
)
This gives PPCs that align reasonably well with my assumption of equal probability among responses (and they also show the skewness of the actual responses).
But this would of course not work when I want to analyse the actual data set.
It does not make sense (to me) to add data just to “fill out categories”, but I would still like the model to give some indication of how many 6 and 7 responses I could expect, given the posterior.
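To make concrete what I am after: in a cumulative probit model with six thresholds, each category probability is the difference of the standard-normal CDF at adjacent thresholds, so category 7 gets whatever mass lies above \tau_6. A small sketch in base R, assuming an intercept-only model and using the prior means from my `validate_prior()` calls as stand-in thresholds (these are not estimates from my data):

```r
# Stand-in thresholds: the six prior means, NOT posterior estimates
tau <- qnorm(1:6 / 7)

# Category probabilities are differences of the standard-normal CDF
# at adjacent thresholds, padded with 0 below and 1 above
p_cat <- diff(c(0, pnorm(tau), 1))
names(p_cat) <- 1:7

# Probability of each of the seven response categories
round(p_cat, 3)
```

With posterior draws of all six thresholds in place of `tau`, the same calculation would give the expected share of 6 and 7 responses, which is exactly what I cannot get when the sixth threshold is missing from the model.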
What am I missing? I would appreciate your thoughts on the issue. (As you can probably tell, I am very much a newbie at ordinal regression using brms, though I found this blog post immensely useful: Notes on the Bayesian cumulative probit | A. Solomon Kurz)
- Operating System: Ubuntu 22.04, R 4.5.0
- brms Version: 2.22.0