How to use Stanc3 compiler optimization flag --O --O1 in cmdstanr?

Michelle · February 15, 2022, 5:23pm

Dear Stan community,

I just read the Stan 2.29.0 release notes (Release of CmdStan 2.29 – The Stan Blog) and found the new auto-formatter, canonicalizer, and optimizations very interesting. I was not aware of these features before. Can they be used from cmdstanr, for example through flags passed when compiling the model? Apart from the nice Stan-to-C++ optimizations demo from @rok_cesnovar, is there a way for R or Python users to apply these features automatically? Thank you very much.

Michelle

rok_cesnovar · February 15, 2022, 7:07pm

Hi,

I apologize for the lack of a cmdstanr example. Here you go:

library(cmdstanr)

n <- 25000
k <- 10
X <- matrix(rnorm(n * k), ncol = k)
y <- rbinom(n, size = 1, prob = plogis(3 * X[,1] - 2 * X[,2] + 1))
mdata <- list(k = k, n = n, y = y, X = X)

mod_opt <- cmdstan_model("lr.stan", stanc_options = list("O1"))
fit_opt <- mod_opt$sample(data = mdata, chains = 1, refresh = 500)

This was used with the following model:

data {
  int<lower=1> k;
  int<lower=0> n;
  matrix[n, k] X;
  array[n] int y;
}
parameters {
  vector[k] beta;
  real alpha;
}
model {
  target += std_normal_lpdf(beta | );
  target += std_normal_lpdf(alpha | );
  target += bernoulli_logit_lpmf(y | alpha + X * beta);
}
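
The original question also asks about the auto-formatter and canonicalizer. A minimal sketch of how these can be reached from cmdstanr, assuming a cmdstanr version that provides the `$format()` method on model objects and a CmdStan installation of 2.29 or later; `lr.stan` is the model file shown above:

```r
library(cmdstanr)

# compile = FALSE: formatting only runs stanc, so no C++ build is needed
mod <- cmdstan_model("lr.stan", compile = FALSE)

# Print the auto-formatted, fully canonicalized program; the file is untouched
mod$format(canonicalize = TRUE)

# Apply only selected canonicalizations (option names are passed on to stanc)
mod$format(canonicalize = list("deprecations", "parentheses"))

# Rewrite lr.stan in place, keeping a backup copy of the original file
mod$format(canonicalize = TRUE, overwrite_file = TRUE, backup = TRUE)
```

Printing first and overwriting only afterwards makes it easy to review what the canonicalizer would change before committing to it.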

rok_cesnovar · February 15, 2022, 7:10pm

Also a quick script to see the comparison in terms of speed:

library(cmdstanr)

n <- 25000
k <- 10
X <- matrix(rnorm(n * k), ncol = k)
y <- rbinom(n, size = 1, prob = plogis(3 * X[,1] - 2 * X[,2] + 1))
mdata <- list(k = k, n = n, y = y, X = X)

mod_no_opt <- cmdstan_model("lr.stan", exe_file = "no_opt_lr")
mod_opt <- cmdstan_model("lr.stan", exe_file = "opt_lr", stanc_options = list("O1"))

fit_no_opt <- mod_no_opt$sample(data = mdata, chains = 1, refresh = 500)

fit_opt <- mod_opt$sample(data = mdata, chains = 1, refresh = 500)

print(fit_no_opt$time()$total)
print(fit_opt$time()$total)

The exe_file is there so we can make 2 executables from the same model file.

The results are:
Chain 1 Iteration:    1 / 2000 [  0%]  (Warmup) 
Chain 1 Iteration:  500 / 2000 [ 25%]  (Warmup) 
Chain 1 Iteration: 1000 / 2000 [ 50%]  (Warmup) 
Chain 1 Iteration: 1001 / 2000 [ 50%]  (Sampling) 
Chain 1 Iteration: 1500 / 2000 [ 75%]  (Sampling) 
Chain 1 Iteration: 2000 / 2000 [100%]  (Sampling) 
Chain 1 finished in 17.8 seconds.
Running MCMC with 1 chain...

Chain 1 Iteration:    1 / 2000 [  0%]  (Warmup) 
Chain 1 Iteration:  500 / 2000 [ 25%]  (Warmup) 
Chain 1 Iteration: 1000 / 2000 [ 50%]  (Warmup) 
Chain 1 Iteration: 1001 / 2000 [ 50%]  (Sampling) 
Chain 1 Iteration: 1500 / 2000 [ 75%]  (Sampling) 
Chain 1 Iteration: 2000 / 2000 [100%]  (Sampling) 
Chain 1 finished in 10.6 seconds.
[1] 17.94077
[1] 10.78738

To avoid getting everyone’s hopes too high: I would not expect such large gains every time :)

Michelle · February 16, 2022, 3:24pm

This is amazing. Thank you very much for developing this. I also really like the online demo, which is a nice tool for checking what I could improve or canonicalize in my messy Stan code. It is also like a home tutor for learning some advanced features of Stan based on my own model. :)

May I ask whether I can blindly apply -O1 to all models, or is it better to always compare?

WardBrian · February 16, 2022, 4:21pm

I would recommend comparing if you are able to, especially while the optimizations are still being developed. If you run into any incorrect behavior or compilation problems with O1 that do not occur without it, we would really love to hear about it.
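
One possible way to make such a comparison, as a rough sketch rather than an official check: `compare_fits` and its `tol` threshold are hypothetical names introduced here, not part of cmdstanr. The helper only assumes two summary tables with `variable`, `mean`, and `sd` columns, which is what `fit$summary()` returns by default.

```r
# Hypothetical helper (not part of cmdstanr): flag variables whose posterior
# means differ by more than `tol` posterior standard deviations.
compare_fits <- function(summary_a, summary_b, tol = 0.1) {
  # Match rows by variable name; shared columns get the suffixes _a and _b
  merged <- merge(summary_a, summary_b, by = "variable",
                  suffixes = c("_a", "_b"))
  # Scale the difference in means by the larger of the two posterior sds
  scale_sd <- pmax(merged$sd_a, merged$sd_b)
  merged$z_diff <- abs(merged$mean_a - merged$mean_b) / scale_sd
  merged$flagged <- merged$z_diff > tol
  merged[, c("variable", "mean_a", "mean_b", "z_diff", "flagged")]
}

# Usage with the two fits from the timing script earlier in the thread:
# compare_fits(fit_no_opt$summary(), fit_opt$summary())
```

The default `tol = 0.1` is an arbitrary choice. Small differences can come from Monte Carlo error alone, so a flagged variable is a reason to look closer, not proof of a bug.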

avehtari · February 16, 2022, 4:38pm

We have found at least one model that segfaults with -O1, so I would not use it blindly yet.

Michelle · March 1, 2022, 5:02pm

Hi @rok_cesnovar, here is some feedback from trying out this optimization flag. I could reproduce your example: no_opt took 14.9s, whereas opt took 20.3s on my machine. However, the exe_file option does not seem to work and gives me an error like

Error in self$compile(...) : unused argument (exe_file = "no_opt_lr")

So I have to delete the .exe file and recompile every time I switch between opt and no_opt. Do you have any idea what I am doing wrong? The exe_file option looks very convenient and I would like to use it. Thanks a lot.

Btw, I also tried it on my hierarchical variate-covariate model with AR(1) residual: it took 262.8s with opt and 285.3s without opt. All tests used the same seed.

rok_cesnovar · March 1, 2022, 5:08pm

This was added just recently. Run:

remotes::install_github("stan-dev/cmdstanr")

and then try again.

Thanks for the feedback!

Michelle · March 1, 2022, 6:11pm

It works. It just seems that I have to specify the extension as .exe, otherwise I get: Error in rethrow_call(c_processx_exec, command, c(command, args), pty, : file not found.
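
A small sketch of one way to keep the comparison script portable given the behavior reported above. `exe_path` is a hypothetical helper, not a cmdstanr function, and newer cmdstanr versions may handle the extension themselves:

```r
# Hypothetical helper (not part of cmdstanr): append ".exe" on Windows only,
# and leave names that already end in ".exe" unchanged.
exe_path <- function(name) {
  if (.Platform$OS.type == "windows" && !grepl("\\.exe$", name)) {
    paste0(name, ".exe")
  } else {
    name
  }
}

# Usage with the comparison script from earlier in the thread:
# mod_no_opt <- cmdstan_model("lr.stan", exe_file = exe_path("no_opt_lr"))
# mod_opt <- cmdstan_model("lr.stan", exe_file = exe_path("opt_lr"),
#                          stanc_options = list("O1"))
```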
