R Package Maintainers Using Stan - Please Read

Dear R package devs using Stan… please take a moment to read on,

We are really thrilled to see an explosion of R packages that use rstan to make Stan more easily usable for everyone. Right now we count 84 reverse dependencies on CRAN for rstan, and 140 for rstan/StanHeaders when including Bioconductor, which is great. At the moment, rstan on CRAN is still at version 2.21.1, dated January 2020. This is obviously far behind the current 2.27.0 version in terms of supported features (no reduce_sum parallelisation, no variadic ODEs… the list is long). There are many reasons why updating is so difficult, and I explain them below. For now we really urge all R package maintainers who use rstan to do two things if you have not done them already:

  1. Please use the rstantools package to manage your Stan models. This package takes care of many things for you behind the scenes in order to keep things working. Please refer to the vignettes on CRAN for an introduction on how to switch an existing package over. My personal recommendation is to use the command rstantools::rstan_create_package("rstanTemplate", rstudio = FALSE) to create a template for your package, and then to try running rstantools::use_rstan() once to convert your existing package (and you should really make sure that your NAMESPACE file is OK, see the very end of this post).
  2. Please put the RcppParallel R package into the Imports and LinkingTo fields of your package. This is now required, since Stan uses the Intel TBB under all circumstances. Also ensure that a call to RcppParallel::RcppParallelLibs() is made during the build. This is handled by rstantools if it is given full control (as recommended). Should your package require more control over Makevars and the like, here is a patch to OpenMx showing the steps needed (Link to RcppParallel for stan math · OpenMx/OpenMx@8e0a967 · GitHub)
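
For point 2, a quick way to see whether a DESCRIPTION file already lists RcppParallel in both fields could look like the sketch below. The helper name missing_dep_fields and the demo DESCRIPTION contents are made up for illustration; only base R's read.dcf() is used, and this is not something rstantools provides.

```r
# Hypothetical helper (not part of rstantools): return the DESCRIPTION
# fields among `fields` that do NOT list the package `pkg`.
missing_dep_fields <- function(desc_path, pkg = "RcppParallel",
                               fields = c("Imports", "LinkingTo")) {
  desc <- read.dcf(desc_path, fields = fields)
  listed <- vapply(fields, function(field) {
    value <- desc[1, field]
    if (is.na(value)) return(FALSE)
    # Split "a (>= 1.0), b" into package names, dropping version constraints.
    entries <- trimws(strsplit(value, ",")[[1]])
    entries <- trimws(sub("\\(.*$", "", entries))
    pkg %in% entries
  }, logical(1))
  fields[!listed]
}

# Self-contained demo on a throwaway DESCRIPTION file (made-up contents).
demo_desc <- tempfile()
writeLines(c(
  "Package: demoPackage",
  "Version: 0.0.1",
  "Imports: Rcpp (>= 0.12.0), RcppParallel (>= 5.0.1), rstan",
  "LinkingTo: Rcpp, StanHeaders"
), demo_desc)

missing_fields <- missing_dep_fields(demo_desc)
print(missing_fields)
```

In a real package you would point it at your own file, e.g. missing_dep_fields("DESCRIPTION"); an empty result means both fields mention RcppParallel.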

The above two points will make it a lot easier for us to push new versions of Stan to CRAN. I would hereby encourage every R package developer using Stan to do the above two things. I would also like to call for volunteers to help others who need support in porting their package over. Of course, we Stan devs will help as well, but the entire process will obviously be much faster as a community effort.

Hamada was kind enough to code up a StanChecks facility, which tests R packages for compliance with all the requirements that must be met to allow for an update of Stan (more details below). The list is accessible here and will be updated every now and then: rstan/revdep at StanHeaders_2.26 · stan-dev/rstan · GitHub

Background

The fundamental problem we have when updating Stan on CRAN is that, given how things are set up right now, we are effectively forced to keep things compatible at the C++ level between Stan releases. However, the Stan project only maintains compatibility at the level of the Stan modelling language, not on the C++ side of things. A critical decision, which we Stan R developers somewhat regret by now, was to split Stan into StanHeaders (BSD licenced stan-math) and rstan (GPLed Stan services, Stan parser). Whenever an upgrade is to be done, we have to upload one package at a time to CRAN. Hence, we first upload StanHeaders to CRAN. CRAN will then test with a mixed setup of these packages: at the moment that would be StanHeaders 2.26 and rstan 2.21. CRAN checks whether all reverse dependencies (your R package) still build and compile just fine in this setup. That is problematic, because we then mix things in a way which the Stan project never intended to support. Your package will still contain C++ code generated by the old 2.21 parser, and this code will be combined with the new Stan Math library that is part of StanHeaders 2.26. Before CRAN introduced their fully automatic build pipeline, it was possible to ask for an exception from the usual rule of testing one package at a time and to upload StanHeaders together with rstan. Doing so allowed us to solve a few problematic instances of Stan updates, but this is no longer an option.

Linking to RcppParallel is required by now, as the Intel TBB is a must-have for Stan programs. The use of rstantools gives us Stan developers a lot more control over the entire build process. For example, upon R package compilation the rstantools package will always recreate the C++ sources from the Stan model. Doing so allows us to ensure that the latest parser on the system is used (avoiding a badly outdated parser generating the C++ sources) and also allows us to inject some tricks which are sometimes needed to get things to work.

The reverse dependency check created by Hamada, mentioned above, does one key thing for testing the “upgradability” of Stan: it runs R CMD check under a mixed setup. That is, one needs to install StanHeaders 2.21 and rstan 2.21 first. Then one needs to install StanHeaders 2.26 via

install.packages("StanHeaders", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

and then run, in this mixed setup (StanHeaders 2.26 & rstan 2.21), the R CMD check command for your package. There are right now a few standard issues which are present in some packages. Some of these (and their solutions) are listed on the page put together by Hamada. In case your package's failure is mysterious to you and you need help, please reach out on the forum here and we will try to help you get things sorted.
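
Before running R CMD check, it can help to confirm that the session really is in the mixed setup. A minimal sketch, assuming the series numbers from this post (2.26 for StanHeaders, 2.21 for rstan); the helper name is_mixed_setup is hypothetical and not part of any Stan package:

```r
# Hypothetical helper: TRUE when StanHeaders is on the new release series
# while rstan is still on the old one (the mixed setup CRAN would test).
is_mixed_setup <- function(headers_version, rstan_version,
                           new_series = "2.26", old_series = "2.21") {
  # Reduce "2.26.4" to its major.minor series "2.26".
  series <- function(v) {
    parts <- strsplit(as.character(v), ".", fixed = TRUE)[[1]]
    paste(parts[1:2], collapse = ".")
  }
  series(headers_version) == new_series && series(rstan_version) == old_series
}

# Demo with explicit version strings.
mixed <- is_mixed_setup("2.26.4", "2.21.1")
not_mixed <- is_mixed_setup("2.26.4", "2.26.4")

# With both packages installed, the same check on the live library would be:
# is_mixed_setup(packageVersion("StanHeaders"), packageVersion("rstan"))
```

If this returns FALSE, the check results say nothing about how your package behaves in the intermediate state on CRAN.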

I hope the above explains matters in sufficient detail. We really hope to release Stan 2.26 ASAP, but we do need your help.

Best,

Ben, Hamada, Steve, Sebastian & the entire Stan team

Whenever you upgrade a package to the new rstantools logic, its NAMESPACE file must include these statements:

import(Rcpp)
import(methods)
importFrom(rstan, sampling)
useDynLib(your-great-package, .registration = TRUE)

Otherwise generating the roxygen2 documentation won’t work.
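
To check an existing NAMESPACE against the four statements above, something like the following base R sketch could be used. The helper name missing_namespace_directives and the demo lines are made up, and the comparison is deliberately simple: it only ignores whitespace, so a directive that roxygen2 writes in a different form (for example several symbols in one importFrom) would be reported as missing.

```r
# Hypothetical helper: return the required NAMESPACE directives that are
# absent from `ns_lines` (the lines of a NAMESPACE file).
missing_namespace_directives <- function(ns_lines, pkg_name) {
  required <- c(
    "import(Rcpp)",
    "import(methods)",
    "importFrom(rstan,sampling)",
    sprintf("useDynLib(%s,.registration=TRUE)", pkg_name)
  )
  # Compare with all whitespace removed so spacing differences do not matter.
  present <- gsub("[[:space:]]", "", ns_lines)
  required[!required %in% present]
}

# Demo on made-up NAMESPACE lines that lack import(methods).
ns_lines <- c(
  "import(Rcpp)",
  "importFrom(rstan, sampling)",
  "useDynLib(demoPackage, .registration = TRUE)"
)
missing_directives <- missing_namespace_directives(ns_lines, "demoPackage")
print(missing_directives)
```

For a real package, pass readLines("NAMESPACE") and your package name instead of the demo values.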

My package is on Bioconductor: GitHub - stemangiola/ppcseq: Probabilistic outlier identification for bulk RNA sequencing data

Just to make sure, are you suggesting the following?

  1. Create a new package with rstantools::rstan_create_package(..ppcseq..) and move my old code into the new directory/package
  2. Once that is done, run rstantools::use_rstan()
  3. Check the NAMESPACE

You should do whatever works best in your case.

For me, it came down to building a template package and copying the relevant bits from it into my package. In fact, I followed the vignette from rstantools and even automated building that test package. Attached is the R file for that. In the process of doing that I figured out what was important (the NAMESPACE file must contain a certain minimum).

So maybe just get started and let us know where and how you get stuck. In case all is smooth, then that’s welcome news as well.

Thanks for porting things over as asked for!

rstantools.R (1.6 KB)

Thanks,

I have made the edits: GitHub - stemangiola/ppcseq at adapt-to-new-stan

If anyone could confirm that this is correct, that would be amazing.

I don’t see this line in the R/*-package.R:

#' @importFrom RcppParallel RcppParallelLibs CxxFlags

Other things look good to me from skimming over it.

Thanks for switching!

Thanks, I will apply this to my other packages.

Do you think you will ever be able to make 100% of packages switch? I guess this is the requirement for updating CRAN on your side.

(and thanks for taking the time and energy to push this)

No, I don’t think that we will manage to get them all switched… but being able to show that we made an effort should help us make the case to CRAN that we push updates through regardless. In that case, downstream packages that have not switched would vanish from CRAN.

(but maybe, hopefully we find a workaround… a lot of pain for the devs, but we need to get out of the situation we are in right now)

I was left a bit unclear about what should be done if a package was created with some older version of rstantools. I created mine that way and later added RcppParallel to Imports and LinkingTo manually. And why do we need

import(Rcpp)
importFrom(rstan, sampling)

for? I don’t have them and have not had any problems generating documentation using roxygen2. The package is here if you want to check it.

I am not sure how to ensure this:

and also don’t know what this

is for and whether everyone needs it or was it just for stemangiola.

This forum probably doesn’t reach all maintainers. CRAN policy site says

  • If an update will change the package’s API and hence affect packages depending on it, it is expected that you will contact the maintainers of affected packages and suggest changes, and give them time (at least 2 weeks, ideally more) to prepare updates before submitting your updated package. Do mention in the submission email which packages are affected and that their maintainers have been informed. In order to derive the reverse dependencies of a package including the addresses of maintainers who have to be notified upon changes, the function reverse_dependencies_with_maintainers is available from the developer website.

so should you also send an email to all maintainers to get more people to do the porting?

I think you need these… for sure the Rcpp thing.

Every package needs that.

I think @bgoodri has written lots of emails as of now. This forum really should reach everyone posting R packages based on Stan…but I agree that this is probably not the case.

We also advertised this post on the Twitter account a few times.

I’m restructuring this issue to track the roadblocks for rstan 2.26. I’ve gone through the reverse dependencies and imports from the CRAN page and tested their building against the new RStan and StanHeaders, with the updated rstantools from this PR.

The failures against the 2.21 and 2.26 combo, and the additional failures for the 2.26 and 2.26 combo, are below (the full list of tested packages is in this spreadsheet).

RStan 2.21 & StanHeaders 2.26

Package        PR Opened
densEstBayes   No public repo - Maintainer contacted
dfpk           17-Sep
MADPop         24-Sep
MetaStan       24-Sep
ProbReco       24-Sep
publipha       24-Sep
RxODE          24-Sep
ssMousetrack   24-Sep
stanette       No public repo - Maintainer contacted
trialr         24-Sep
visit          24-Sep
RxODE          29-Sep

RStan 2.26 & StanHeaders 2.26

Package    PR Opened
beanz      17-Sep
cbq        27-Sep
idem       24-Sep
rstanarm   24-Sep
nlmixr     29-Sep
multinma   Needs updated Stanc3 release due to bug
rmdcev     Needs updated handling of standalone functions in rstantools
lgpr       Needs updated handling of standalone functions in rstantools

Wow, this is fantastic Andrew! Hats off!

This is great, @andrjohns! Could you link the GitHub repo/PR to the package name in your list?

A new version of rstan and StanHeaders v2.26.4 is now available to install via:

remove.packages(c("StanHeaders", "rstan"))
install.packages("StanHeaders", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))
install.packages("rstan", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

This version of StanHeaders is compatible with the CRAN version of rstan. It also builds a static TBB and shouldn’t need the RcppParallel dependency. So, please test it and report any issues.

@andrjohns @bgoodri @wds15 I’ve updated the revdep results for StanHeaders v2.26.4. Now only 2 packages fail to install (25 failed with v2.26.3) and 4 packages fail in tests. Some of the remaining issues are segfaults or errors that currently also occur on CRAN.

Great! I’ve already opened a PR over in the ProbReco GitHub repo and emailed the stanette maintainer with what needs to be changed, so there’s not much more we can do for those.

Yup, that’s great! And with your effort to inform the maintainers, CRAN shouldn’t complain about 6 (out of ~90) packages if we’ve already suggested the required changes weeks before the release.

But, we need to test the current version because static TBB is a major change. As we discussed in the last call, it’s good to use the compatible TBB headers and code from Math, but static TBB might cause hidden issues.

I was wondering how it was going with CRAN. Do we have any time estimate for the update?

Thanks!
