Improving Stan sampling speed

Hi all,
I have a very complex model that I’m running with multithreading on a server (where I can’t really install anything). I read @avehtari’s blog post, Options for improving Stan sampling speed – The Stan Blog,
but I’m a bit confused about how to actually implement this. I think one should add these optimization options in cpp_options, but cmdstanr doesn’t seem to check them, so I can’t tell whether they are right or whether they worked.

My conclusion is that I should use

cmdstan_model("model.stan",
              cpp_options = list(stan_threads = TRUE,
                                 O = 1,
                                 stan_cpp_optims = TRUE,
                                 stan_no_range_checks = TRUE))

Is this correct? Should this work on every system? Is there anything else I could do that plays well with multithreading? I don’t care if compilation time increases. The model takes a day and a half to finish, so I’ll be happy to do anything that improves sampling time.

The O = 1 in your cpp_options is probably unintentional. You will generally want O = 3 for the C++ compiler, and since that is also the default for Stan, you don’t need to specify anything for it.

Aki separately discusses the Stan compiler setting, which only supports optimization levels 0 (default) or 1, so that is where --O1 could be used.
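To make the distinction concrete, here is a rough sketch of how the two kinds of options might be passed separately in cmdstanr. It assumes the "model.stan" file name from the question, and the variable names cpp_opts and stanc_opts are only for illustration. The call is guarded so that nothing is compiled unless cmdstanr and the model file are actually present.

```r
# Sketch only: "model.stan" is the file name used in the question above.
# C++ build options: leave O unset so the default C++ optimization level is kept.
cpp_opts <- list(stan_threads = TRUE)

# Options for the Stan-to-C++ compiler (stanc): this is where O1 belongs.
stanc_opts <- list("O1")

# Compile only if cmdstanr and the model file are available.
if (requireNamespace("cmdstanr", quietly = TRUE) && file.exists("model.stan")) {
  mod <- cmdstanr::cmdstan_model(
    "model.stan",
    cpp_options = cpp_opts,
    stanc_options = stanc_opts
  )
}
```

The point is just that the optimization level for stanc goes in stanc_options, while cpp_options is for the C++ build.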

To answer your more general questions, the STAN_CPP_OPTIMS setting uses a lot of flags which may or may not be supported, depending on the exact version of your compiler. If they aren’t supported, the C++ compiler will just refuse to build, so it’s not unsafe to try it.

These arguments are all case-sensitive, I believe. If you want to check that they worked, you can also use the cmdstanr::cmdstan_make_local function, which writes the options to $CMDSTAN/make/local, so they are then applied to every build you do with that installation of cmdstan. You can also manually edit that file with the things Aki’s post suggests.
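As a rough sketch of that workflow (the helper name reset_make_local and the particular options in local_opts are only illustrative, not something from Aki’s post): calling cmdstan_make_local() with no cpp_options should just return the current contents of make/local, and append = FALSE should replace the file instead of adding to it.

```r
# Illustrative options; pick whichever flags you actually want in make/local.
local_opts <- list(STAN_THREADS = TRUE, STAN_CPP_OPTIMS = TRUE)

# Hypothetical helper: show make/local, overwrite it, and return the new contents.
# It is only defined here, not called, so nothing is written until you call it.
reset_make_local <- function(opts) {
  # With cpp_options = NULL (the default) nothing is written; the current
  # contents of make/local are returned.
  print(cmdstanr::cmdstan_make_local())

  # append = FALSE overwrites make/local rather than appending to it.
  cmdstanr::cmdstan_make_local(cpp_options = opts, append = FALSE)
}

# Usage (requires a working CmdStan installation):
# reset_make_local(local_opts)
# cmdstanr::rebuild_cmdstan()  # may be needed for some flags to take effect
```

Afterwards, opening make/local in the CmdStan directory is the simplest way to confirm what will be used for the next build.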

ok, thanks!

I made a mess with “cmdstan_make_local” because it didn’t let me remove things. But I found that there is a local.example file with many things that could be directly uncommented.

The only thing I added was the following:

STANCFLAGS += --O1

CXXFLAGS += -march=native
CXXFLAGS += -mtune=native

So far it’s running!