Guidance on prior predictive checks in an ordinal model

Coefficients that are not sensitive to the prior specification remain small when using a wide prior, and I have no reason to believe that the coefficients that are sensitive to the prior specification could have larger impacts than the others.

My reference model is the model with all 11 predictors (some are categorical with several levels), and it likely starts overfitting above 7-8 parameters.

The model with 1 predictor is the most parsimonious, but I guess carefully interpreting the model with 6 predictors would be fine, given the small improvement in elpd (and that the projpred elpd plot is based on models without monotonic effects), and given that the results make sense. Is that correct?
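For context, a minimal sketch of the projpred workflow referred to above. projpred cannot handle mo() terms, so fit_ref_nomo stands for a hypothetical reference fit without the monotonic effects:

library(projpred)
vs <- cv_varsel(fit_ref_nomo)            # cross-validated variable selection
plot(vs, stats = "elpd", deltas = TRUE)  # the elpd plot mentioned above
suggest_size(vs)                         # heuristic for a parsimonious size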

A last question is about computing an R2. I read that bayes_R2() is probably biased for ordinal models, and I saw in another post (R^2 calculation for brm model with cumulative family type - #4 by andymilne) that a Bayesian McKelvey-Zavoina R2 could be a better approximation?
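For concreteness, a minimal sketch of that McKelvey-Zavoina idea for a cumulative("logit") fit (my illustration, not the code from the linked post; fit stands for a hypothetical brmsfit): per posterior draw, take the variance of the latent linear predictor over that variance plus the logistic residual variance pi^2/3.

library(brms)

mz_r2 <- function(fit) {
  eta <- posterior_linpred(fit)   # draws x observations, latent logit scale
  var_eta <- apply(eta, 1, var)   # explained latent variance per draw
  var_eta / (var_eta + pi^2 / 3)  # pi^2/3 is the logistic residual variance
}

quantile(mz_r2(fit), probs = c(0.05, 0.5, 0.95))  # posterior summary of R2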


On projpred and monotonic effects:

Will do, thanks!

I should have been clearer: when asking about the model, I consider the prior to be part of the model. The prior matters a lot for whether the model with all predictors overfits or not (we did run the experiments with Noa).

What prior did you use?

For predictive purposes I’m in favor of the bigger model; for explanatory purposes it gets more difficult.

Probably. I don’t have any idea how to interpret any R2 for an ordinal model.

I only modified the b_ priors to normal(0, 1) in the reference model:

library(brms)

# Reference model with all predictors
fit_ref <- brm(
  formula = extinction_cat ~
      cov_numeric1
    + cov_numeric2
    + cov_nominal1
    + cov_nominal2
    + cov_nominal3
    + cov_nominal4
    + cov_nominal5
    + cov_nominal6
    + mo(cov_ordinal1)  # monotonic effects for the ordinal predictors
    + mo(cov_ordinal2)
    + mo(cov_ordinal3)
  , data = df_extinction
  , family = cumulative("logit")
  , prior = c(prior(normal(0, 1), class = "b"))
)

This is not a bad choice, but because some levels (in both the target and the predictors) have so few observations, the predictive performance is also prior sensitive. I did run a few test runs, and I would say you should not trust any single prior or report results based on just a single prior, but rather illustrate the sensitivity. Prior sensitivity is common with rare events, and then you cannot avoid dealing with the complications of the data not being informative on its own.
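As one minimal way to illustrate that sensitivity (a sketch of mine, not a prescription from this thread; fit_ref is the reference model code above), refit with a few prior scales and compare coefficients and elpd:

library(brms)

prior_sds <- c(0.5, 1, 2.5)  # illustrative scales, my choice
fits <- lapply(prior_sds, function(s) {
  update(fit_ref,            # refits (and recompiles) with the new prior
         prior = prior_string(paste0("normal(0, ", s, ")"), class = "b"))
})
names(fits) <- paste0("sd_", prior_sds)

lapply(fits, fixef)                  # coefficient posteriors across priors
loo::loo_compare(lapply(fits, loo))  # elpd across priors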

Thank you for your suggestion.
I will then report the main results with prior(normal(0, 1)) and discuss the sensitivity in the supplementary material.
Is there a way to integrate this sensitivity into one final result? With some kind of model averaging, maybe?
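One concrete form of that model-averaging idea would be stacking weights over the prior sweep; just a sketch reusing the hypothetical fits list from above, not a recommendation from this thread:

library(loo)
loo_model_weights(lapply(fits, loo), method = "stacking")  # weight per prior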


I will mark your latest reply as the “Solution”, but all the posts in this thread were really helpful! Thank you all!


Yes, but then you still probably have sensitivity with respect to the distribution over the different prior parameter values. Really, the problem is having target and predictor levels with 0 observations, which makes the log predictive density and parameter posteriors weakly identified by the likelihood. It is still possible that some of your quantities of interest and conclusions are not sensitive. So far you have said you are interested in parameter posteriors and model selection, but it would be interesting to know what the actual scientific question is.

Please also post here some of the results on how you end up presenting the sensitivity analysis, as this is an interesting example and I’d like to learn more about it; it might help in developing better advice in the future.

I’m interested in explaining the extinction risk of species (response variable) relative to the biological/ecological traits of the species (all predictors).

I will! (I think I will add the power-scaling sensitivity analysis and some discussion in the supplementary material.)
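For later readers, that power-scaling check could look like this minimal sketch with the priorsense package (fit_ref as in the model code above):

library(priorsense)
powerscale_sensitivity(fit_ref)  # flags posteriors sensitive to power-scaling
                                 # the prior and/or the likelihood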