30%-40% drop in sampling time using stanc O1 optimizations

brms-generated code is not (yet) optimized to benefit from --O1, so there may be more brms models that benefit from --O1 later.

Were you saying that this brms code:
backend="cmdstanr", stan_model_args=list(stanc_options = list("O1"))
might not do anything yet because brms is not set up to take advantage of the O1 option? I just ran a quite simple model, and it did actually run faster with this extra line added!
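For anyone wanting to try the same comparison, here is a minimal sketch. It assumes CmdStan and cmdstanr are installed, and it uses the `epilepsy` data that ships with brms purely as a stand-in model; the object names are made up for illustration. Note that `system.time()` here also counts compilation, so the sampler's own reported timings are the better guide to sampling speed.

```r
library(brms)

# Stand-in model: the epilepsy example data bundled with brms
f <- count ~ zAge + zBase * Trt + (1 | patient)

# Baseline: cmdstanr backend with default stanc settings
t_default <- system.time(
  fit_default <- brm(f, data = epilepsy, family = poisson(),
                     backend = "cmdstanr", seed = 1)
)

# Same model, compiled with the stanc O1 optimizations
t_o1 <- system.time(
  fit_o1 <- brm(f, data = epilepsy, family = poisson(),
                backend = "cmdstanr", seed = 1,
                stan_model_args = list(stanc_options = list("O1")))
)

# Wall-clock totals (compilation + sampling) for each run
t_default["elapsed"]
t_o1["elapsed"]
```

Since O1 has had bugs, it is probably also worth checking that the two fits give matching posterior summaries, e.g. by comparing `summary(fit_default)` and `summary(fit_o1)`.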

Would you generally suggest just having this as the default, or are there cases where this might go awry? (I’m mostly just using fairly standard models: Gaussian, ordinal, categorical, often multilevel.)

Really great to have these speed improvements!

At the time I posted, Paul hadn’t taken --O1 into account, although some of the generated model code already happened to be written so that --O1 helps. Since then, Paul has made some easy changes to the code generation so that even more models benefit. There are likely more cases where the brms code generation could be changed to benefit from --O1, but as usual it’s a matter of time and resources. There are also models that are unlikely to get big speedups before additional improvements in stanc3.

Simple models are most likely to benefit from --O1. The more complex the model is, the more difficult it is to write Stan code so that --O1 will have any effect.

Since the last release, at least two additional bugs have been found that caused either compilation to fail or sampling to crash. If I remember correctly, there haven’t yet been bugs where compilation and sampling ran but gave wrong results. So I think that at the moment it’s fine for people to turn it on by default in their own environment, but we probably need more people to test it before turning it on by default in CmdStan/interface releases.
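One way to turn it on by default in your own environment is a small wrapper around `brm()`. This is only a sketch: `brm_o1` is a hypothetical helper name, not part of brms, and it assumes cmdstanr and CmdStan are installed. The usage line fits an ordinal multilevel model to the `inhaler` data bundled with brms as an example.

```r
library(brms)

# Hypothetical helper (not part of brms): forwards all arguments to brm()
# and always compiles with the cmdstanr backend and stanc O1 optimizations.
# Do not also pass backend or stan_model_args, or R will report a
# duplicated-argument error.
brm_o1 <- function(...) {
  brm(
    ...,
    backend = "cmdstanr",
    stan_model_args = list(stanc_options = list("O1"))
  )
}

# Usage: an ordinal multilevel model on the inhaler example data
fit <- brm_o1(rating ~ period + carry + (1 | subject),
              data = inhaler, family = cumulative())
```

If compilation fails or sampling crashes with the wrapper, refitting with plain `brm()` is a quick way to tell whether O1 is the cause.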


Thanks, that’s good to know. Maybe I’ll make sure I have the latest brms too, then, and see if there is some additional benefit. But even the small speedup could already make a big difference in simulations or in models that would normally take a while to run.