Predicting population-level probabilities from an ordinal probit model

I am trying to predict, from my model estimated in brms, the population-level probability of responding in each of the categories 1, 2, 3, 4, 5, or 6 of the dependent variable. Is this possible? I need these probabilities to draw some ROC curves. I tried to calculate them manually from the cumulative distribution and the model parameters, but I get different results than from the model.

test <- predict(mod, newdata = data.frame(predictor1 = 0, predictor2 = 0, discParameter1 = 0, discParameter2 = 0, SubjectID = NA))

Is setting SubjectID = NA the right way to get the population-level probabilities for my model?

Thanks for your reply :)

The question is a bit hard to answer without seeing the actual model. Assuming SubjectID is the only varying intercept (random effect) in the model, you can choose from at least three basic prediction tasks that could be considered “population”:

  1. Predict the population mean without taking the varying intercept into account, i.e. ignoring the between-subject variability. This can be done by setting re_formula = NA.
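  2. Predict for a hypothetical previously unseen subject, drawn from the same population, i.e. drawing a completely new varying intercept for each prediction from normal(0, sd_SubjectID). This can be done by setting SubjectID = "any_previously_unseen_value" and using allow_new_levels = TRUE together with sample_new_levels = "gaussian" (the default, sample_new_levels = "uncertainty", resamples from the fitted subjects instead).
  3. Predict for a single subject randomly chosen from the fitted ones, i.e. reusing the fitted varying intercepts. This can be done using the steps in 2) with sample_new_levels = "old_levels", which reuses all posterior draws of one randomly chosen existing subject; the default "uncertainty" instead picks a possibly different existing subject for each posterior sample.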
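A minimal sketch of the three options above, assuming `mod` is the fitted brms model from the question and `nd` is a data frame holding the predictor columns used there. The helper name `population_probs` and the level label "new_subject" are made up for illustration, and posterior_epred() is used in place of predict() because it returns expected category probabilities per posterior draw rather than summaries of simulated responses.

```r
# Sketch only: `mod` and `nd` are assumed to come from the question above.
# Returns a matrix (rows of `nd` x response categories) of posterior mean
# category probabilities for the chosen prediction task.
population_probs <- function(mod, nd,
                             task = c("mean", "new_subject", "old_subject")) {
  task <- match.arg(task)
  if (task == "mean") {
    # Option 1: drop all group-level terms.
    draws <- brms::posterior_epred(mod, newdata = nd, re_formula = NA)
  } else {
    # Options 2 and 3: predict for a level that was not in the data.
    nd$SubjectID <- "new_subject"  # hypothetical, previously unseen label
    how <- if (task == "new_subject") "gaussian" else "old_levels"
    draws <- brms::posterior_epred(mod, newdata = nd,
                                   allow_new_levels = TRUE,
                                   sample_new_levels = how)
  }
  # For ordinal families `draws` is a draws x observations x categories array;
  # average over the draws dimension.
  apply(draws, c(2, 3), mean)
}
```

With the values from the question, a call could look like `population_probs(mod, data.frame(predictor1 = 0, predictor2 = 0, discParameter1 = 0, discParameter2 = 0), task = "mean")`.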

Which of those makes the most sense for you depends on the actual question you are asking; there is no general answer here. There are also some more exotic options which I am leaving out for simplicity.
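For comparing a manual calculation against option 1, here is a small base-R sketch of how category probabilities follow from the parameters of a cumulative probit model with a discrimination parameter. It assumes the brms parameterization P(Y <= k) = pnorm(disc * (threshold_k - eta)) and that disc is modelled on the log scale (the default link), so forgetting the exp() is one possible source of mismatch. All numeric values and the function name `category_probs` are made up for illustration.

```r
# Hypothetical parameter values; replace them with values from the fitted model.
thresholds <- c(-1.5, -0.5, 0, 0.5, 1.5)  # 5 thresholds for 6 categories
eta <- 0.3       # linear predictor of the mean (population-level terms only)
log_disc <- 0.2  # linear predictor of disc, on the log scale

category_probs <- function(thresholds, eta, log_disc = 0) {
  disc <- exp(log_disc)                    # back-transform from the log link
  cum <- pnorm(disc * (thresholds - eta))  # P(Y <= k) for k = 1, ..., K - 1
  diff(c(0, cum, 1))                       # P(Y = k) for k = 1, ..., K
}

probs <- category_probs(thresholds, eta, log_disc)
```

Note that applying this to posterior means of the parameters generally gives different numbers than applying it to every posterior draw and averaging the resulting probabilities; the latter is what the model-based predictions correspond to.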

Speaking specifically about ROC curves (which I admit I am not a big fan of - are you sure you need ROC curves?) - wouldn’t it make more sense to make predictions for the full dataset, and then calculate a separate ROC curve for each posterior sample? The ensemble of curves then expresses the model uncertainty about the ROC curve. Whether this is sensible obviously depends on the question you are asking, so once again I don’t want to force this on you.
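One way the per-sample idea could look, as a rough base-R sketch. It assumes a rating-style ROC (as in signal detection analyses) that compares the category probabilities under two conditions, here called "noise" and "signal"; whether that matches your design is a guess on my part. The inputs are draws x categories matrices of category probabilities, which in practice could be slices of the array returned by posterior_epred(). The function name `roc_per_draw` and the toy numbers are made up for illustration.

```r
# p_noise, p_signal: draws x categories matrices of category probabilities.
# Returns, for every draw, the ROC points P(Y >= k) for k = K, ..., 2.
roc_per_draw <- function(p_noise, p_signal) {
  K <- ncol(p_noise)
  upper_tail <- function(p) {
    # Reverse the categories, accumulate, and drop the last column (always 1).
    cum <- t(apply(p[, K:1, drop = FALSE], 1, cumsum))
    cum[, -K, drop = FALSE]
  }
  list(false_alarm = upper_tail(p_noise), hit = upper_tail(p_signal))
}

# Toy input: 4 "posterior draws" of 6 category probabilities per condition.
set.seed(1)
raw_noise <- matrix(rgamma(4 * 6, shape = 2), nrow = 4)
raw_signal <- matrix(rgamma(4 * 6, shape = 2), nrow = 4)
p_noise <- raw_noise / rowSums(raw_noise)
p_signal <- raw_signal / rowSums(raw_signal)

roc <- roc_per_draw(p_noise, p_signal)
```

Each row of `roc$false_alarm` and `roc$hit` then gives the points of one curve, and plotting all rows on top of each other shows the ensemble.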

Best of luck with your model!


Thank you very much for your reply! :) This helped a lot, I’m gonna try to use re_formula.