Hierarchical Multinomial Model Optimization/Specification

Paul Bürkner has built a lot of standard optimisations into brms, so I am not even going to presume I could do better than his implementation.

From just scanning the priors, I think the model has a lot of very broad priors, which look harmless on the logit scale but imply counter-intuitive prior distributions over the category probabilities. Basically, with predictors on the unit scale, normal(0, 5) coefficients and half-Cauchy scales put a lot of prior mass on large logits, which says you expect the model to be relatively sure that a category will be chosen or not chosen. My advice would be to start with normal(0, 1) for the coefficients and exponential(1) for the hierarchical scales, and do a prior predictive check with brms. You can do that with sample_prior = "only" in brm(). You probably don’t need 4500 warmup iterations if the model is appropriate for your data. Once you have the priors right, you can maybe start with warmup = 1000 and iter = 2000 to see whether you get any speed-up.
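To make the point about broad priors concrete, here is a rough sketch in plain Python (not brms, and not your model). It assumes a 3-category model with the first category as reference, one intercept and one slope per non-reference category, a single predictor drawn uniformly on [0, 1], and no hierarchical terms. The function names and the 0.99 threshold are made up for illustration. It just counts how often the prior alone makes one category nearly certain.

```python
import math
import random


def softmax_with_reference(etas):
    """Category probabilities when the reference category's logit is fixed at 0."""
    logits = [0.0] + list(etas)
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def share_near_certain(prior_sd, n_categories=3, n_draws=20000,
                       threshold=0.99, seed=1):
    """Fraction of prior draws in which some category has probability > threshold.

    Assumption: intercepts and slopes are all normal(0, prior_sd), and the
    single predictor x is uniform on [0, 1].
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        x = rng.random()
        etas = []
        for _ in range(n_categories - 1):
            intercept = rng.gauss(0.0, prior_sd)
            slope = rng.gauss(0.0, prior_sd)
            etas.append(intercept + slope * x)
        if max(softmax_with_reference(etas)) > threshold:
            hits += 1
    return hits / n_draws


wide = share_near_certain(5.0)    # normal(0, 5) on all coefficients
narrow = share_near_certain(1.0)  # normal(0, 1) on all coefficients
print(f"normal(0, 5): share of near-certain draws = {wide:.3f}")
print(f"normal(0, 1): share of near-certain draws = {narrow:.3f}")
```

Under these assumptions the normal(0, 5) prior should produce a sizeable share of near-certain draws and normal(0, 1) almost none. A proper prior predictive check in brms would also include the hierarchical scales, which this sketch leaves out.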

The way I think about it is that if you can improve the prior with 1 day of work and it shaves off 2 days of computation, that’s a win.


EDIT: Also have a look at this post that focuses on avoiding redundant computations. This might help as well.
