# Ordinal-ish model: setting priors and pp_check

Hello everyone,

I have some data from an experiment in which the participants were given a Likert-like scale. I say Likert-like because it is actually just a continuous scale with ticks at the numbers 1 to 7. So, participants can theoretically respond at any point along the continuous (but bounded) scale. In practice, responses tend to be pulled toward where the ticks are.

I’ve tried to run a gaussian model with a uniform prior between the minimum and maximum of the scaled variable, just to see how it works. The pp_check trace shows that the predictions are not so great, as they don’t replicate the spikes where each tick was in the real data:

[variable 1]

[variable 2]

It looks like the predictions are worse for the skewed variable (for which many participants answered 5 or 6 out of 7). Is there any way I can improve the fit here?

• Operating System: Mac OS Catalina
• brms Version: 2.15.0

Morning,

A couple of things will help folks troubleshoot:
Can you post the full model code or call, along with some data (a snippet or simulated)?
Can you also post the output of get_prior() here?

Generally speaking, I am not a fan of uniform priors.
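For the priors, something like this would do (the formula and data names here are just placeholders, since we haven't seen your model yet):

```
library(brms)
get_prior(rating ~ 1 + scale(block) + (1 | ID), data = your_data)
```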

2 Likes

Data snippet:
likert_data_snippet.csv (7.4 KB)

Model:

```
# First I will center the rating variable (which goes from 1-7)
# by subtracting 4 and dividing by 3. Since this is a bounded distribution,
# I am setting a uniform prior with lower and upper bounds of -1 and 1
# for the centered/scaled rating. The following is for rating one:

data$rating1_ctr <- (data$rating1 - 4) / 3

mypriors <- c(set_prior("uniform(-1,1)", class = "Intercept"),
              set_prior("normal(0,1)", class = "b"))

mod.BayesRating1 <- brm(formula = rating1_ctr ~ 1 +
                          scale(block) + (1 | ID),
                        data = data, family = gaussian(link = "identity"),
                        prior = mypriors, sample_prior = "yes",
                        warmup = 1000, iter = 5000, chains = 2,
                        inits = "0", seed = 123)
```

As I’ve since upgraded to brms 2.16.1, the function is now prior_summary():

```
                prior     class       coef group resp dpar nlpar bound       source
          normal(0,1)         b                                                user
          normal(0,1)         b scaleblock                             (vectorized)
        uniform(-1,1) Intercept                                                user
 student_t(3, 0, 2.5)        sd                                             default
 student_t(3, 0, 2.5)        sd                 ID                     (vectorized)
 student_t(3, 0, 2.5)        sd  Intercept     ID                     (vectorized)
 student_t(3, 0, 2.5)     sigma                                             default
```

Additionally, I’ve graphed the prior and posterior distributions together using prior_draws() and as_draws_df():

1 Like

There are several things you could do. Unless you make your model an ordinal model, I don’t think it is easy to capture the peaks around the different numbers, which most people will indeed gravitate towards. You could simply bin your data so that you end up with an ordinal scale from 1-7 and run an ordinal regression; then you will get the different peaks, although you will lose the fact that some people did not answer exactly on a number. How important do you think it is to capture that, realistically?
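As a rough sketch of the binned approach (variable names assumed from your earlier post), you could round to the nearest tick and fit a cumulative model:

```
# Round the continuous 1-7 responses to the nearest tick and
# model them as ordered categories with a cumulative (ordinal) family
data$rating1_ord <- round(data$rating1)

mod.ord <- brm(rating1_ord ~ 1 + scale(block) + (1 | ID),
               data = data,
               family = cumulative(link = "logit"),
               chains = 2, warmup = 1000, iter = 5000, seed = 123)

pp_check(mod.ord, type = "hist")
```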

Other options could include beta regression, or what another user on the forum here, @saudiwin, has been developing: ordered beta regression. This matches the kind of data you have well. The forum post about it is here (New paper using Stan/brms to estimate feeling thermometers/VAS scales), which includes links to the paper and a vignette using the approach in brms.

Your model’s failure to capture the peaks in the data is not something the prior will solve: if you estimate with a gaussian model, the predicted values will follow a gaussian distribution.

2 Likes

Hello, thank you for the suggestions! I tried a few different ways, and I thought it would be helpful to show what happened for posterity.

For reference, here is an example of the raw rating data, where we can see that some participants have a bigger spread of scores than others.

Here is a regular ordinal model, rounding to the nearest tick (note, I switched my pp_check to type “hist” so that the predictions don’t look like squiggly lines):

[real data: dark blue, modeled data: light blue]

Looks pretty good, but we are losing some information, since some participants had a very narrow range of responses, and likely made use of points in between ticks. In fact, in the raw data above, you can see there are some tiny peaks at what looks to be halfway between ticks.

Doing an ordered beta regression picks out the peaks at zero and one, but doesn’t account for all the other spikes. Maybe there would be a way to handle those if I knew more about ordered beta regression, but I don’t think I have the time to figure that out.

[real data: dark blue, modeled data: light blue]

What I like best for now is a normal ordinal regression with finer-grained categories (multiplying my variable by two before rounding). This seems to capture the fact that there are some responses between the ticks, while still working within brms’ built-in modeling options. [real data: dark blue, modeled data: light blue]
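In code, the finer-grained binning looks roughly like this (again a sketch, with names assumed from earlier in the thread):

```
# Double the resolution: the 1-7 scale becomes 13 ordered categories,
# so half-tick responses get their own category
data$rating1_fine <- round(data$rating1 * 2)

mod.ord.fine <- brm(rating1_fine ~ 1 + scale(block) + (1 | ID),
                    data = data,
                    family = cumulative(link = "logit"),
                    chains = 2, warmup = 1000, iter = 5000, seed = 123)

pp_check(mod.ord.fine, type = "hist")
```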

Now, of course once you start on this road, there are infinite “resolutions” of ordinal data you could try…

If you rescale the data to range from 0 to 1, it seems like the zero-one inflated beta distribution would be a good starting point. But you would want to use the custom likelihood functionality (see the Define Custom Response Distributions with brms vignette) to include five more inflation points corresponding to the inner ticks (2 through 6). Unfortunately, my custom likelihood chops aren’t advanced enough to help with your specific use case.

@chelseaparlett, this might be up your alley. @matti, maybe you too.

3 Likes

That sounds interesting, though I’m not sure I have the custom likelihood chops for that either ;)

A related question I had was how to set the priors for these ordinal-type variables.

I understand that the output has multiple intercepts corresponding to the thresholds. I tried priors for “Intercept” generally at Normal(0,1) and Normal(0,5) – I guess this is in the logit scale as for other brms models.

However, this means that the left- and right-most intercepts would be pretty far away from this prior, right? Should I specify a different prior for each intercept, as I’ve seen here?
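For what it’s worth, brms does let you set a separate prior on each threshold via the coef argument of set_prior; the cutpoints of a cumulative model are named "1", "2", and so on. A sketch for a 7-category (six-threshold) model, where the specific means and scales are just placeholders:

```
# One prior per threshold; coef names "1" ... "6" index the six
# cutpoints of a 7-category cumulative model (values are placeholders)
threshold_priors <- c(
  set_prior("normal(-2.5, 1)", class = "Intercept", coef = "1"),
  set_prior("normal(-1.5, 1)", class = "Intercept", coef = "2"),
  set_prior("normal(-0.5, 1)", class = "Intercept", coef = "3"),
  set_prior("normal( 0.5, 1)", class = "Intercept", coef = "4"),
  set_prior("normal( 1.5, 1)", class = "Intercept", coef = "5"),
  set_prior("normal( 2.5, 1)", class = "Intercept", coef = "6")
)
```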

I think Solomon’s idea is super interesting and creative! You can think of what he suggested as a mixture model with many components.

So the first part of your model models a simplex (a vector of probabilities that add up to 1) that represents the probability of being in each “group”. One group is the continuous responses, then one group for each value you’d like to inflate.

You can set a Dirichlet prior for this simplex.

That’s all you have to do for the groups that are inflated–I believe. Then for the continuous group, you’d fit it the same way you’d fit a typical beta regression (using a link function and beta likelihood).
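To make the mixture idea concrete, here is a small simulation sketch (plain R, not brms; all the weights and beta parameters are made up) showing how one draw from such a model could be generated:

```
set.seed(1)

# Inflation points on the 0-1 scale (the 7 ticks rescaled), plus one
# extra "continuous" component; the weights form a simplex (sum to 1)
ticks   <- (1:7 - 1) / 6
weights <- c(0.15, 0.05, 0.05, 0.10, 0.20, 0.15, 0.05, 0.25)  # last = continuous

rmix <- function(n) {
  comp <- sample(length(weights), n, replace = TRUE, prob = weights)
  out  <- ticks[comp]                 # tick components; NA for the continuous one
  cont <- comp == length(weights)
  out[cont] <- rbeta(sum(cont), 2, 2) # fill in continuous draws from a beta
  out
}

hist(rmix(1e4), breaks = 100)
```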

However, where I am SUPER unhelpful is that I don’t know how to do this in brms. This might be one of those cases where it’s easier to get the Stan code from a ZOIB model in brms and modify it, because inflated models (a.k.a. this mixture model) are easier to code in Stan than you’d expect. And this special case would look very similar to any ZOIB-in-Stan tutorial; you’d just have a larger simplex and more if statements in your likelihood.

Let me see if I can find anything that would be helpful to do this in Stan.

Edit:

Here’s some Stan code for a ZOIB that I simplified from one of my own models. I didn’t have time to simulate data and make sure I didn’t make any errors, but I did try to add comments indicating the parts you would need to change to accommodate the larger number of mixtures. Someone more knowledgeable than me about ZOIB and Stan should probably check this over, as there may be errors!

2 Likes

IMO, either approach to the threshold priors is fine, particularly if you have a decent sample size. But if you have good prior work to base the threshold priors on, by all means use it.

1 Like

Thanks; I have a pretty good sample size and not much prior information about the thresholds, so I’ll stick with the one prior for now, but this is good to know. Thanks again for your help!

Looks to me like your ‘rounded ordinal’ model might well capture quite a lot of the important info. The ZOIB model and the extended ZOIB with custom inflation points are really cool, although the many parameters become difficult to interpret, and I imagine more so with even more inflation points. So you can decide for yourself the costs/benefits given your use case!

1 Like