Status on RStan, CRAN, and latest versions of Stan?

Hah, that was a misleading error message!

Ah, I’ll wait then

I will discuss with @rok_cesnovar - I think there have been enough minor fixes in the past week to justify a 2.30.1


After installing the experimental version of RStan, you can use the nightly stanc3 as follows:

file.remove(system.file("stanc.js", package = "rstan"))
download.file("https://github.com/stan-dev/stanc3/releases/download/nightly/stanc.js", file.path(find.package("rstan"), "stanc.js"))
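The two lines above delete the bundled stanc.js before the download has succeeded. A slightly more cautious variant, sketched here under assumptions, downloads to a temporary file first and keeps a backup of the old compiler. The helper name `swap_stancjs` and the `.bak` suffix are hypothetical and not part of RStan.

```r
# Hypothetical helper (not part of RStan): put a new stanc.js in place,
# keeping a copy of the previous one as stanc.js.bak.
swap_stancjs <- function(new_file, pkg_dir) {
  target <- file.path(pkg_dir, "stanc.js")
  backup <- file.path(pkg_dir, "stanc.js.bak")
  # Refuse to continue if the new file is missing or empty.
  stopifnot(file.exists(new_file), file.info(new_file)$size > 0)
  if (file.exists(target)) {
    file.copy(target, backup, overwrite = TRUE)
  }
  ok <- file.copy(new_file, target, overwrite = TRUE)
  if (!ok) stop("could not write ", target)
  invisible(backup)
}

# Usage, assuming rstan is installed and the nightly URL from the post above:
# tmp <- tempfile(fileext = ".js")
# download.file("https://github.com/stan-dev/stanc3/releases/download/nightly/stanc.js", tmp)
# swap_stancjs(tmp, find.package("rstan"))
```

If the nightly build misbehaves, copying stanc.js.bak back over stanc.js restores the previous compiler.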

Sounds good. Thanks!


Should RStan’s stan() command honour options(stanc.allow_optimizations = TRUE)? I installed the latest experimental version (with your commits today), but running stan() creates the C++ code without the --O1 flag.

It should. What’s the output of the following:

formals(rstan::stanc)

and

getOption("stanc.allow_optimizations")
> getOption("stanc.allow_optimizations")
[1] TRUE
> formals(rstan::stanc)
$file


$model_code
[1] ""

$model_name
[1] "anon_model"

$verbose
[1] FALSE

$obfuscate_model_name
[1] TRUE

$allow_undefined
isTRUE(getOption("stanc.allow_undefined", FALSE))

$allow_optimizations
isTRUE(getOption("stanc.allow_optimizations", FALSE))

$standalone_functions
isTRUE(getOption("stanc.standalone_functions", FALSE))

$use_opencl
isTRUE(getOption("stanc.use_opencl", FALSE))

$warn_pedantic
isTRUE(getOption("stanc.warn_pedantic", FALSE))

$warn_uninitialized
isTRUE(getOption("stanc.warn_uninitialized", FALSE))

$isystem
c(if (!missing(file)) dirname(file), getwd())

How do you know it doesn’t honour the option? Maybe your code isn’t affected by the O1 optimizations? Also, does rstan::stanc(model_code = <stancode>, allow_optimizations = TRUE) generate different C++ code?
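One way to run that comparison is sketched below. It assumes an RStan build whose stanc() has the allow_optimizations and obfuscate_model_name arguments shown in the formals() output above, and that the returned list has a cppcode element. The function name `cpp_changes_with_O1` is made up for illustration.

```r
# Sketch: does allow_optimizations change the C++ that stanc() generates?
# Requires an RStan version with the arguments listed in formals(rstan::stanc).
cpp_changes_with_O1 <- function(model_code) {
  # Turn off name obfuscation so that a varying model name cannot cause
  # a spurious difference between the two translations.
  plain <- rstan::stanc(model_code = model_code,
                        obfuscate_model_name = FALSE,
                        allow_optimizations = FALSE)
  optimized <- rstan::stanc(model_code = model_code,
                            obfuscate_model_name = FALSE,
                            allow_optimizations = TRUE)
  !identical(plain$cppcode, optimized$cppcode)
}

# Usage (not run here, needs rstan):
# code <- paste(readLines("model.stan"), collapse = "\n")
# cpp_changes_with_O1(code)
```

A FALSE result would suggest either that the flag is not reaching stanc or that the model has nothing for O1 to rewrite, so it does not settle the question on its own.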

The generated code shows

    return std::vector<std::string>{"stanc_version = stanc3 v2.30.0", "stancflags = "};

It works with CmdStanR, with a 50% drop in sampling time.

Also misses the --O1 from that line I mentioned above

I would not rely on the "stancflags = " for stancjs. This line is generated based on the presumed existence of argv from the system, which won’t be populated on stancjs

Argh

Agreed, this is an oversight we should fix

Edit: Allow driver code to set printed argument list in generated hpp by WardBrian · Pull Request #1231 · stan-dev/stanc3 · GitHub


I was not aware that it isn’t respected in stanc.js. Sorry for misleading you.


Don’t worry. It was useful with CmdStanR, and thanks to this it will be fixed.


When will the O1 optimizations be certified Aki-fresh?

Seriously, @avehtari, thanks for going down this rabbit hole and improving the code base!


I now got stan_model() with allow_optimizations = TRUE to work and see a 50% speedup. I don’t have time right now to investigate further or to test stan() or rstanarm compilation.


I would also like to thank @hsbadr - having the experimental branch of RStan working with develop is great for stress testing these features of the stancjs interface and also testing against more complicated models which people in the R ecosystem have been building up for years


Yes, huge thanks to @hsbadr! He’s the hero of this story.


I have now successfully tested that these work:

stan_model(file=file, allow_optimizations = TRUE)

and

options(stanc.allow_optimizations = TRUE)
stan(file=file, data=data)

Two suggestions:

  • In RStan, stan_model(., verbose = TRUE) could also print the stanc flags
  • The RStan model object stores the CPPFLAGS, but it would be useful to store the STANCFLAGS as well

There is also an issue: RStan tries to be clever about avoiding recompilation, but fails when the STANCFLAGS change. I first ran stan_model(file=file1, allow_optimizations = FALSE) and then ran stan_model(file=file2, allow_optimizations = TRUE), where file2 was the same as file1 except for the name and initial whitespace. The second call doesn’t recreate the C++ code but recompiles the cached C++ code with optimizations off.
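To illustrate the caching problem, here is a minimal sketch of a cache key that depends on the stanc flags as well as the model code. It is not how RStan actually decides to reuse a model. The function `model_cache_key` and the whitespace normalization are assumptions made for the example.

```r
# Illustrative only: derive a cache key from the model code AND the stanc flags,
# so that changing a flag forces the C++ to be regenerated.
model_cache_key <- function(model_code, stanc_flags = character()) {
  # Trim each line and drop empty ones, so cosmetic edits reuse the cache.
  normalized <- trimws(strsplit(model_code, "\n", fixed = TRUE)[[1]])
  normalized <- normalized[nzchar(normalized)]
  key_file <- tempfile()
  on.exit(unlink(key_file))
  # Flags are sorted so their order does not change the key.
  writeLines(c(normalized, "--flags--", sort(stanc_flags)), key_file)
  unname(tools::md5sum(key_file))
}

code <- "parameters { real y; }\nmodel { y ~ normal(0, 1); }"
key_plain <- model_cache_key(code)
key_O1 <- model_cache_key(paste0("  ", code), stanc_flags = "O1")
print(key_plain == key_O1)
```

With a key like this, the file2 call described above would miss the cache because its flags differ, even though the model text is effectively the same.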

I was also able to install the latest rstanarm, but I am not yet convinced that the --O1 flag got through.

However, I realized I can check whether rstanarm code has anything that can use SoA, and it seems there is not much speedup expected. For example, bernoulli.stan:

> model <- cmdstan_model(stan_file="bernoulli.stan", compile=FALSE);model$check_syntax(stanc_options = list("debug-mem-patterns", "O1"))
vector[z_beta_1dim__] z_beta: AoS
vector[K_smooth] z_beta_smooth: AoS
vector[smooth_sd_raw_1dim__] smooth_sd_raw: AoS
array[vector[K], hs] local: AoS
array[vector[K], mix_1dim__] mix: AoS
vector[q] z_b: AoS
vector[len_z_T] z_T: AoS
vector[len_rho] rho: AoS
vector[len_concentration] zeta: AoS
vector[t] tau: AoS
vector[K] beta: AoS
vector[K_smooth] beta_smooth: AoS
vector[smooth_sd_1dim__] smooth_sd: AoS
vector[q] b: AoS
vector[len_theta_L] theta_L: AoS
vector[0] inline_hs_prior_return_sym109__: AoS
vector[inline_hs_prior_K_sym110__] inline_hs_prior_lambda_sym111__: AoS
vector[inline_hs_prior_K_sym110__] inline_hs_prior_lambda2_sym113__: AoS
vector[inline_hs_prior_K_sym110__] inline_hs_prior_lambda_tilde_sym114__: AoS
vector[0] inline_hs_prior_return_sym102__: AoS
vector[inline_hs_prior_K_sym103__] inline_hs_prior_lambda_sym104__: AoS
vector[inline_hs_prior_K_sym103__] inline_hs_prior_lambda2_sym106__: AoS
vector[inline_hs_prior_K_sym103__] inline_hs_prior_lambda_tilde_sym107__: AoS
vector[0] inline_hsplus_prior_return_sym94__: AoS
vector[inline_hsplus_prior_K_sym95__] inline_hsplus_prior_lambda_sym96__: AoS
vector[inline_hsplus_prior_K_sym95__] inline_hsplus_prior_eta_sym97__: AoS
vector[inline_hsplus_prior_K_sym95__] inline_hsplus_prior_lambda_eta2_sym99__: AoS
vector[inline_hsplus_prior_K_sym95__] inline_hsplus_prior_lambda_tilde_sym100__: AoS
vector[0] inline_hsplus_prior_return_sym86__: AoS
vector[inline_hsplus_prior_K_sym87__] inline_hsplus_prior_lambda_sym88__: AoS
vector[inline_hsplus_prior_K_sym87__] inline_hsplus_prior_eta_sym89__: AoS
vector[inline_hsplus_prior_K_sym87__] inline_hsplus_prior_lambda_eta2_sym91__: AoS
vector[inline_hsplus_prior_K_sym87__] inline_hsplus_prior_lambda_tilde_sym92__: AoS
vector[0] inline_make_theta_L_return_sym126__: AoS
vector[len_theta_L] inline_make_theta_L_theta_L_sym127__: AoS
matrix[inline_make_theta_L_nc_sym132__, inline_make_theta_L_nc_sym132__] inline_make_theta_L_T_i_sym133__: AoS
vector[inline_make_theta_L_nc_sym132__] inline_make_theta_L_pi_sym137__: AoS
vector[inline_make_theta_L_r_sym142__] inline_make_theta_L_T_row_sym139__: AoS
vector[0] inline_make_b_return_sym145__: AoS
vector[rows(z_b)] inline_make_b_b_sym146__: AoS
matrix[inline_make_b_nc_sym149__, inline_make_b_nc_sym149__] inline_make_b_T_i_sym150__: AoS
vector[inline_make_b_nc_sym149__] inline_make_b_temp_sym153__: AoS
vector[(K + K_smooth)] coeff: AoS
vector[N[1]] eta0: AoS
vector[N[2]] eta1: AoS
vector[inline_ll_clogit_lp_J_sym180__] inline_ll_clogit_lp_summands_sym183__: AoS
vector[inline_ll_clogit_lp_N_g_sym185__] inline_ll_clogit_lp_eta_g_sym187__: AoS
vector[0] inline_pw_bern_return_sym159__: AoS
vector[inline_pw_bern_N_sym160__] inline_pw_bern_ll_sym161__: AoS
vector[inline_pw_bern_N_sym160__] inline_pw_bern_pi_sym162__: SoA
vector[0] inline_pw_bern_inline_linkinv_bern_return_sym4___sym163__: AoS
vector[0] inline_pw_bern_return_sym168__: AoS
vector[inline_pw_bern_N_sym169__] inline_pw_bern_ll_sym170__: AoS
vector[inline_pw_bern_N_sym169__] inline_pw_bern_pi_sym171__: SoA
vector[0] inline_pw_bern_inline_linkinv_bern_return_sym4___sym172__: AoS
vector[(p[inline_decov_lp_i_sym197__] - 1)] inline_decov_lp_shape1_sym193__: AoS
vector[(p[inline_decov_lp_i_sym197__] - 1)] inline_decov_lp_shape2_sym194__: AoS
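A long listing like this is easier to judge as a tally. The sketch below counts the AoS and SoA tags in a character vector of output lines. It assumes the listing has already been saved as text, and the function name `count_mem_patterns` is hypothetical.

```r
# Count memory layouts in text produced with the debug-mem-patterns option.
# Input: a character vector with one or more "name: AoS" / "name: SoA" entries per line.
count_mem_patterns <- function(lines) {
  # Extract every ": AoS" or ": SoA" tag, even if two entries share a line.
  tags <- unlist(regmatches(lines, gregexpr(": (AoS|SoA)", lines)))
  table(factor(sub("^: ", "", tags), levels = c("AoS", "SoA")))
}

# Three lines taken from the listing above.
example <- c(
  "vector[K] beta: AoS",
  "vector[q] b: AoS",
  "vector[inline_pw_bern_N_sym160__] inline_pw_bern_pi_sym162__: SoA"
)
counts <- count_mem_patterns(example)
print(counts)
```

A low SoA count relative to AoS is consistent with the expectation that O1 has little to speed up in this model.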

The specific example I’ve been running is a bit sensitive to random initialization and the exact prior specification, but for logistic regression with a regularized horseshoe prior and p=1536, n=54, stan_glm is now an order of magnitude slower than the corresponding brms-generated code. I know it’s probably difficult to change the rstanarm code as it needs to be very flexible, but I’m pinging @bgoodri so that he is at least aware of the experiment.

And thanks @hsbadr for all the help here and for all the work on getting RStan and rstanarm to work with the latest Stan!
