R Package Maintainers Using Stan - Please Read

Dear R package devs using Stan, please take a moment to read on.

We are thrilled to see an explosion of R packages that use rstan to make Stan more easily usable for everyone. Right now we count 84 reverse dependencies on CRAN for rstan, and 140 for rstan/StanHeaders when including Bioconductor, which is great. At the moment, rstan on CRAN is still at version 2.21.1, dating from January 2020. This is obviously way out of date in terms of supported features compared to the current 2.27.0 version (no reduce_sum parallelisation, no variadic ODEs… the list is long). There are many reasons why updating is so difficult, and I will explain them below. For now, we urge all R package maintainers who use rstan to do two things, if you have not done them already:

  1. Please use the rstantools package to manage your Stan models. This package takes care of many things for you behind the scenes to keep things working. Please refer to the vignettes on CRAN for an introduction on how to switch an existing package over. My personal recommendation is to use the command rstantools::rstan_create_package("rstanTemplate", rstudio=FALSE) to create a template for your package and then try the rstantools::use_rstan() command once to convert your existing package (and you should really make sure that your NAMESPACE file is OK; see the very end of this post).
  2. Please put the RcppParallel R package into the Imports and LinkingTo fields of your package. This is now required, as Stan uses the Intel TBB under all circumstances. Also ensure that a call to RcppParallel::RcppParallelLibs() is made during the build. This is handled by rstantools if it is given full control (as recommended). Should your package require more control over Makevars and the like, here is a patch to OpenMx showing the steps needed (Link to RcppParallel for stan math · OpenMx/OpenMx@8e0a967 · GitHub)
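For the second point, a quick way to see whether a DESCRIPTION file already lists RcppParallel in both fields is sketched here. This is only a rough illustration in base R: has_dep_in_fields() is a hypothetical helper (it is not part of rstantools or StanChecks), and the package name and version constraint in the demo file are made up.

```r
# Hypothetical helper (not part of rstantools or StanChecks): reports, per
# field, whether a DESCRIPTION file lists the given package.
has_dep_in_fields <- function(description_path, pkg = "RcppParallel",
                              fields = c("Imports", "LinkingTo")) {
  desc <- read.dcf(description_path, fields = fields)
  vapply(fields, function(field) {
    value <- desc[1, field]
    if (is.na(value)) return(FALSE)  # field is absent from DESCRIPTION
    # Split on commas, then drop version constraints such as "(>= 1.0)"
    entries <- strsplit(value, ",", fixed = TRUE)[[1]]
    entries <- trimws(sub("\\(.*\\)", "", entries))
    pkg %in% entries
  }, logical(1))
}

# Demo on a throwaway DESCRIPTION; the name and version numbers are invented
demo_path <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: myStanPackage",
  "Version: 0.1.0",
  "Imports: Rcpp, RcppParallel (>= 5.0.1), rstan",
  "LinkingTo: Rcpp, RcppParallel, StanHeaders"
), demo_path)
demo_result <- has_dep_in_fields(demo_path)
print(demo_result)
```

Pointing the helper at the DESCRIPTION of your own package gives one logical value per field; a FALSE means the field still needs RcppParallel added. It does not check the build-time call to RcppParallel::RcppParallelLibs().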

The above two points will make it a lot easier for us to push new versions of Stan to CRAN. I would hereby encourage every R package developer using Stan to do the above two things. I would also like to call for volunteers to support others who need help porting their packages. Of course, we Stan devs will help as well, but the entire process will be much faster as a community effort.

Hamada was kind enough to code up a StanChecks facility, which tests R packages for compliance with all the requirements that must be met before we can update Stan (more details below). The list is accessible here and will be updated every now and then: rstan/revdep at StanHeaders_2.26 · stan-dev/rstan · GitHub

Background

The fundamental problem we have when updating Stan on CRAN is that, with the current setup, we are effectively forced to keep things compatible at the C++ level between Stan releases. However, the Stan project only maintains compatibility at the level of the Stan modelling language, not on the C++ side of things. A critical decision, which we Stan R developers somewhat regret by now, was to split Stan into StanHeaders (BSD-licensed stan-math) and rstan (GPLed Stan services, Stan parser). Whenever an upgrade is due, we have to upload one package at a time to CRAN. Hence, we first upload StanHeaders to CRAN, and CRAN then tests with a mixed setup of these packages. At the moment, that would mean StanHeaders 2.26 and rstan 2.21 on CRAN. CRAN checks whether all reverse dependencies (your R package) still build and compile in this setup. That is problematic, because it mixes things in a way the Stan project never intended to support: your package will still contain C++ code generated by the old 2.21 parser, and this code will be combined with the new Stan Math library that is part of StanHeaders 2.26. Before CRAN introduced its fully automatic build pipeline, it was possible to ask for an exception from the usual rule of testing one package at a time and to upload StanHeaders together with rstan. Doing so resolved a few problematic Stan updates, but this is no longer an option.

Linking to RcppParallel is now required, as the Intel TBB is a must-have for Stan programs. The use of rstantools gives us Stan developers a lot more control over the entire build process. For example, upon R package compilation, the rstantools package always recreates the C++ sources from the Stan model. This lets us ensure that the latest parser on the system is used (rather than a badly outdated parser generating the C++ sources) and also lets us inject some tricks which are sometimes needed to get things to work.

The reverse dependency check created by Hamada does one key thing to test the “upgradability” of Stan: it runs R CMD check under a mixed setup. That is, one first needs to install StanHeaders 2.21 and rstan 2.21. Then one needs to install StanHeaders 2.26, done via

install.packages("StanHeaders", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

and then run the R CMD check command for your package in this mixed setup (StanHeaders 2.26 & rstan 2.21). Right now there are a few standard issues present in some packages. Some of these (and their solutions) are listed on the page put together by Hamada. If your package's failure is mysterious to you and you need help, please reach out on the forum here and we will try to help you get things sorted.
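Before running the check, it can help to confirm that the library really is in the mixed state described above. The sketch below compares two version strings on their major.minor part. is_mixed_setup() is a hypothetical helper, and the patch-level numbers in the demo are illustrative only.

```r
# Hypothetical helper: TRUE when StanHeaders and rstan differ in their
# major.minor version, i.e. the "mixed setup" used for the check.
is_mixed_setup <- function(stanheaders_version, rstan_version) {
  # Extract the first two version components as an integer vector
  major_minor <- function(v) unclass(package_version(v))[[1]][1:2]
  !identical(major_minor(stanheaders_version), major_minor(rstan_version))
}

# Illustrative version strings; the patch levels are made up
mixed   <- is_mixed_setup("2.26.1", "2.21.2")
matched <- is_mixed_setup("2.21.0", "2.21.2")
print(c(mixed = mixed, matched = matched))
```

On a real system one would pass as.character(utils::packageVersion("StanHeaders")) and as.character(utils::packageVersion("rstan")) instead of literal strings, which requires both packages to be installed.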

I hope the above explains matters in sufficient detail. We really hope to release Stan 2.26 ASAP, but we do need your help.

Best,

Ben, Hamada, Steve, Sebastian & the entire Stan team

Whenever you upgrade a package to the new rstantools logic, its NAMESPACE file must include these statements:

import(Rcpp)
import(methods)
importFrom(rstan, sampling)
useDynLib(your-great-package, .registration = TRUE)

Otherwise generating the roxygen2 documentation won’t work.
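A rough way to check an existing NAMESPACE file for these four statements is sketched below. missing_namespace_directives() is a hypothetical helper, not something rstantools provides, and the comparison is deliberately simple: it works line by line and ignores whitespace, so it will not recognise equivalent directives written in another form (for example an importFrom(rstan, ...) line that imports several functions at once).

```r
# Hypothetical helper: returns the required directives that are NOT found
# in the NAMESPACE file (an empty character vector means all are present).
missing_namespace_directives <- function(namespace_path, pkg) {
  required <- c(
    "import(Rcpp)",
    "import(methods)",
    "importFrom(rstan,sampling)",
    sprintf("useDynLib(%s,.registration=TRUE)", pkg)
  )
  # Strip all whitespace so that spacing differences do not matter
  present <- gsub("[[:space:]]", "", readLines(namespace_path))
  required[!required %in% present]
}

# Demo on a throwaway NAMESPACE; "myStanPackage" is a made-up package name
demo_namespace <- tempfile("NAMESPACE")
writeLines(c(
  "import(Rcpp)",
  "import(methods)",
  "importFrom(rstan, sampling)",
  "useDynLib(myStanPackage, .registration = TRUE)"
), demo_namespace)
missing_directives <- missing_namespace_directives(demo_namespace, "myStanPackage")
print(missing_directives)
```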


I have my package ppcseq (GitHub - stemangiola/ppcseq: Probabilistic outlier identification for bulk RNA sequencing data) on Bioconductor.

Just to make sure, are you suggesting the following?

  1. Create a new package with rstantools::rstan_create_package(..ppcseq..) and move my old code into the new directory/package
  2. Once that is done, run rstantools::use_rstan()
  3. Check the NAMESPACE

You should do whatever works best in your case.

For me it was about building a template package, from which I carried the relevant bits over into my package. In fact, I followed the vignette from rstantools and even automated building that test package. Attached is the R file for that. In the process of doing that I figured out what was important (the NAMESPACE file must contain a certain minimum).

So maybe just get started and let us know where you get stuck and how. If all goes smoothly, that is welcome news as well.

Thanks for porting things over as asked for!

rstantools.R (1.6 KB)
