Significant Differences between `rstan` and `rstanarm` running times


I am testing different Stan models setups with Stan models and a rstanarm model.

I am seeing a lot of differences in computation times between Stan and rstanarm. Even with the QR reparameterization with Stan.

I’ve simulated data with the following dimensions
X: 10.000 obs and 7 predictors (Matrix)
y: a linar combination of the Xs (Vector)

Here’s the setup:


  • stan.stan: Simple multivariate gaussian linear model
  • stan_std.stan: Same as stan.stan with the data standardized to mean 0 and variance/sd 1
  • stan_qr.stan: Same as stan.stan with the QR decomposition (to speed up things)


  • stan_glm(y ~ .): default parameter QR = TRUE

In all samplings, I am using the default arguments (4 chains, 2000 iterations) and I have in .Rprofile file the multicore option enabled options(mc.cores = parallel::detectCores()).

For all 3 Stan models I am seeing a run time of around ~900s (with the QR a little more fast than normal and std)

But for rstanarm I am seeing a run time of 13s!!!

What am I doing wrong? I’ve pre-compiled all models before I’ve run them.

This is my session info in a 2018 MacBook Pro 13" i7 2.7 GHz Quad-Core Intel Core i7 16GB RAM with 321GB SSD free

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] microbenchmark_1.4-7 rstan_2.21.2         ggplot2_3.3.2        StanHeaders_2.21.0-5 rstanarm_2.21.1     
[6] Rcpp_1.0.5          

loaded via a namespace (and not attached):
 [1] splines_4.0.2      jsonlite_1.7.0     gtools_3.8.2       RcppParallel_5.0.2 threejs_0.3.3      shiny_1.5.0       
 [7] assertthat_0.2.1   statmod_1.4.34     stats4_4.0.2       yaml_2.2.1         pillar_1.4.6       lattice_0.20-41   
[13] glue_1.4.1         digest_0.6.25      promises_1.1.1     minqa_1.2.4        colorspace_1.4-1   Matrix_1.2-18     
[19] htmltools_0.5.0    httpuv_1.5.4       plyr_1.8.6         dygraphs_1.1.1.6   pkgconfig_2.0.3    purrr_0.3.4       
[25] xtable_1.8-4       scales_1.1.1       processx_3.4.3     later_1.1.0.1      lme4_1.1-23        tibble_3.0.3      
[31] bayesplot_1.7.2    generics_0.0.2     ellipsis_0.3.1     DT_0.15            withr_2.2.0        shinyjs_1.1       
[37] cli_2.0.2          survival_3.2-3     magrittr_1.5       crayon_1.3.4       mime_0.9           ps_1.3.3          
[43] fansi_0.4.1        nlme_3.1-148       MASS_7.3-51.6      xts_0.12-0         pkgbuild_1.1.0     colourpicker_1.0  
[49] rsconnect_0.8.16   tools_4.0.2        loo_2.3.1          prettyunits_1.1.1  lifecycle_0.2.0    matrixStats_0.56.0
[55] stringr_1.4.0      V8_3.2.0           munsell_0.5.0      callr_3.4.3        compiler_4.0.2     rlang_0.4.7       
[61] nloptr_1.2.2.2     grid_4.0.2         ggridges_0.5.2     rstudioapi_0.11    htmlwidgets_1.5.1  crosstalk_1.1.0.1 
[67] igraph_1.2.5       miniUI_0.1.1.1     base64enc_0.1-3    boot_1.3-25        gtable_0.3.0       codetools_0.2-16  
[73] inline_0.3.15      curl_4.3           markdown_1.1       reshape2_1.4.4     R6_2.4.1           gridExtra_2.3     
[79] rstantools_2.1.1   zoo_1.8-8          dplyr_1.0.1        fastmap_1.0.1      shinystan_2.5.0    shinythemes_1.1.2 
[85] stringi_1.4.6      parallel_4.0.2     vctrs_0.3.2        tidyselect_1.1.0  

rstan-vs-rstarnarm.R (947 Bytes) stan_qr.stan (836 Bytes) stan_std.stan (684 Bytes) stan.stan (516 Bytes)

I don’t have time to look too closely right now unfortunately, but my hunch is this is because we do some tricks in rstanarm that are probably relevant here. For linear regression models we do this

to compute the likelihood when possible. A slightly less efficient way, but more efficient than the standard coding, is to use Stan’s special normal_id_glm() function: