Thanks for the suggestion! I looked at the threading vignette for brms, and it looks like it automatically parallelizes the log-likelihood calculation. Since the likelihood for my model is lognormal, I assume it is cheap to evaluate (but I will confirm that with profiling). In that case, I would not expect brms's approach to give me much of a speedup. Parallelizing only the likelihood is also what the 'simple reduce_sum' model I posted does, and it resulted in a substantial slowdown.
The threading vignette also notes the need to play around with grainsize, so that’s probably something I need to do.
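For concreteness, here is a minimal sketch of the kind of model where grainsize gets passed in as data so it can be varied between runs. This is not my actual model: the single predictor `x`, the priors, and the names `partial_sum`, `alpha`, `beta`, and `sigma` are all placeholders, and it assumes a Stan version with the `array[]` syntax.

```stan
functions {
  // Log likelihood for one slice of y; start and end index into the full mu.
  real partial_sum(array[] real y_slice, int start, int end,
                   vector mu, real sigma) {
    return lognormal_lpdf(y_slice | mu[start:end], sigma);
  }
}
data {
  int<lower=1> N;
  array[N] real<lower=0> y;
  vector[N] x;                 // placeholder predictor
  int<lower=1> grainsize;      // passed as data so it can be tuned per run
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  // Placeholder priors, for illustration only.
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ exponential(1);
  // Only the likelihood is split across threads.
  target += reduce_sum(partial_sum, y, grainsize, alpha + beta * x, sigma);
}
```

With grainsize = 1 the scheduler chooses the slice sizes itself; larger values set a rough slice size, so I can rerun the same model with a few different values and compare timings.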
I can also try to implement some of the changes from this thread: "How to most efficiently reduce_sum in a hierarchical logistic model".
Thanks again!