Note: This same question was also posted on Twitter; I am making a Discourse thread for those who prefer this channel.
For a few reasons, mostly related to grants and papers, we would like to know whether you, the users of Stan and brms, are using some of the performance-related features recently added to Stan and brms. If you are, please tell us a bit more about how and where you use them (the model type, what the gain was, etc.).
We are primarily interested in whether you are using:
- within-chain parallelization with the reduce_sum function in Stan or brms (threads argument to brm)
- profiling in Stan
- OpenCL backend in Stan or brms (opencl argument to brm)
If you are using any other recently added feature that is not listed here and that has greatly helped your work or research, feel free to share that as well.