# Predicting population-level probabilities from an ordinal probit model

I’m trying to predict, from a model I estimated in brms, the population-level probability of responding in each of the categories 1, 2, 3, 4, 5, or 6 (the dependent variable). Is this possible? I need these probabilities to draw some ROC curves. I tried to calculate them manually from the cumulative distribution function and the model parameters, but I get different results than from the model.

```r
test <- predict(mod, data.frame(predictor1 = 0, predictor2 = 0, discParameter1 = 0, discParameter2 = 0, SubjectID = NA))
```

Is this the way to get the population-level probabilities for my model by saying SubjectID = NA?

Hi,
the question is a bit hard to answer without seeing the actual model. Assuming `SubjectID` is the only varying intercept (random effect) in the model, you can choose from at least three basic prediction tasks that could be considered “population”:

1. Predict the population mean without taking the varying intercept into account, i.e. ignoring the between-subject variability. This can be done by setting `re_formula = NA`.
2. Predict for a hypothetical previously unseen subject, drawn from the same population, i.e. drawing a completely new varying intercept for each prediction from `normal(0, sd_SubjectID)`. This can be done by setting `SubjectID = "any_previously_unseen_value"` and using `allow_new_levels = TRUE` together with `sample_new_levels = "gaussian"` (the default, `"uncertainty"`, instead resamples from the posterior draws of the fitted subjects).
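3. Predict for a single subject randomly chosen from the population, i.e. taking one of the fitted varying intercepts. This can be done with a previously unseen `SubjectID` and `allow_new_levels = TRUE` as in 2), combined with `sample_new_levels = "old_levels"`, which uses the draws of one randomly chosen fitted subject throughout. As far as I can tell from the brms documentation, the default `sample_new_levels = "uncertainty"` instead picks a possibly different fitted subject for each posterior sample.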
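A rough sketch of how the three options above might look in code. It assumes `mod` is the fitted `brmsfit` from the question, that `SubjectID` is the only grouping factor, and that the predictor names match those in the question; `"new_subject"` is just a made-up label that does not occur in the data. I have not run this against the actual model, so treat it as a starting point.

```r
library(brms)

# Assumption: `mod` is the fitted model from the question.
newdat <- data.frame(predictor1 = 0, predictor2 = 0,
                     discParameter1 = 0, discParameter2 = 0)

# 1. Population level only: drop all group-level terms.
p1 <- posterior_epred(mod, newdata = newdat, re_formula = NA)

# 2. A new, previously unseen subject with an intercept drawn
#    from normal(0, sd_SubjectID).
newdat_new <- transform(newdat, SubjectID = "new_subject")  # hypothetical label
p2 <- posterior_epred(mod, newdata = newdat_new,
                      allow_new_levels = TRUE,
                      sample_new_levels = "gaussian")

# 3. One of the fitted subjects, chosen at random.
p3 <- posterior_epred(mod, newdata = newdat_new,
                      allow_new_levels = TRUE,
                      sample_new_levels = "old_levels")

# For ordinal families the result is a draws x observations x categories
# array; averaging over draws gives one probability per category.
apply(p1, c(2, 3), mean)
```

Keeping the full array rather than only the averaged probabilities preserves the posterior uncertainty for later steps.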

Which of those makes the most sense for you depends on the actual question you are asking, no general answer here. There are also some more exotic options which I am leaving out for simplicity.
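Regarding the manual calculation mentioned in the question: for option 1, a hand computation could look like the sketch below. It assumes the usual cumulative probit parameterisation, P(Y <= k) = Phi(disc * (threshold_k - eta)), and that `disc` is modelled on the log scale (which I believe is the brms default, but please check your model). The threshold values are invented for illustration. One possible source of a mismatch is plugging posterior means into this formula; since the transformation is non-linear, it should be applied to each posterior draw and averaged afterwards.

```r
# Category probabilities of a cumulative probit model.
# thresholds: K - 1 increasing cutpoints
# eta:        linear predictor (population level only)
# disc:       discrimination parameter on its natural scale
cumulative_probit_probs <- function(thresholds, eta = 0, disc = 1) {
  cum <- c(pnorm(disc * (thresholds - eta)), 1)  # P(Y <= k), k = 1..K
  diff(c(0, cum))                                # P(Y = k)
}

# Hypothetical parameter values for a single posterior draw.
thresholds <- c(-1.5, -0.5, 0, 0.5, 1.5)
log_disc <- 0  # linear predictor of disc, assumed to be on the log scale

probs <- cumulative_probit_probs(thresholds, eta = 0, disc = exp(log_disc))
probs
```

With real draws, the function would be applied row by row to the posterior draws of the thresholds, coefficients, and `disc` coefficients.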

Speaking specifically about ROC curves (which I admit I am not a big fan of - are you sure you need ROC curves?) - wouldn’t it make more sense to make predictions for the full dataset, and then calculate a separate ROC curve for each posterior sample? The ensemble of curves then expresses the model’s uncertainty about the ROC curve. Whether this is sensible obviously depends on the question you are asking, so once again I don’t want to force this on you.
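To make the per-sample idea concrete, here is a minimal base R sketch. Everything in it is simulated: `y` stands in for some binary split of the six response categories (which split to use is your call), and `score` for a draws x observations matrix of predicted probabilities, which in practice might be derived from `posterior_epred(mod)`. The names `roc_curve` and `auc` are helpers defined here, not brms functions.

```r
set.seed(1)
n_obs <- 200
n_draws <- 50

# Hypothetical binary outcome, e.g. "response in category 4 or higher".
y <- rbinom(n_obs, 1, 0.4)

# Hypothetical draws x observations matrix of predicted probabilities.
score <- matrix(
  plogis(rnorm(n_draws * n_obs, mean = rep(2 * y - 1, each = n_draws))),
  nrow = n_draws
)

# ROC curve for one posterior draw: sweep the threshold from high to low.
roc_curve <- function(score, label) {
  lab <- label[order(score, decreasing = TRUE)]
  data.frame(fpr = c(0, cumsum(lab == 0) / sum(lab == 0)),
             tpr = c(0, cumsum(lab == 1) / sum(lab == 1)))
}

# Area under the curve via the rank (Mann-Whitney) formula.
auc <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# One curve and one AUC per posterior draw.
curves <- lapply(seq_len(n_draws), function(d) roc_curve(score[d, ], y))
auc_draws <- apply(score, 1, auc, label = y)

# Posterior summary of the AUC.
quantile(auc_draws, c(0.025, 0.5, 0.975))
```

Plotting all elements of `curves` on top of each other with some transparency gives the ensemble of curves.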

Best of luck with your model!

Thank you very much for your reply! :) This helped a lot, I’m gonna try to use re_formula.