Comments and questions about building packages with rstantools

My colleague @nschiett and I wanted to share some comments and questions about developing a package based on rstantools, i.e. using the function rstan_create_package.

Our package is hosted here: https://github.com/nschiett/fishflux

Questions

  • Should we keep .o, .cc, .cpp, and .h files under src/ in the package repo? rstantools (via rstan_create_package) doesn’t add those to src/ by default, but pkgbuild::compile_dll() generates a number of them from the code in inst/stan. When comparing our repo with that of rstanarm, we noticed that files with these extensions are not present there. For that reason, and because including them makes the package heavier to download from GitHub, we deliberately removed them from the repo. Should they be kept in the repo? If not, perhaps rstantools could add them automatically to .gitignore?

  • Maybe this relates to the previous question, but is there a way to avoid pre-compilation during installation? remotes::install_github("nschiett/fishflux") triggers pre-compilation, but we assumed the whole point of using rstantools was to have an entirely compiler-free package. Or did we make the wrong assumption?

Comments

Thanks heaps!

Maybe @jonah can answer this?

Yeah, thanks for tagging me.

@dbarneche Sorry for the slow reply. I hadn’t noticed this post. I’ll try to answer your questions, but first let me say that fishflux is a great package name!

Yes to gitignore. I think that should happen now after the pull request “Add generated C++ files to .gitignore and .Rbuildignore” by mcol (Pull Request #66, stan-dev/rstantools) from last month. Although I guess if you add your Stan programs after using rstan_create_package(), it might not automatically gitignore the compiled files. We’ll look into that.
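
For a package whose Stan programs were added after running rstan_create_package(), the ignore entries can be added by hand in the meantime. The following is a minimal sketch in base R. The helper name add_ignore_patterns and the pattern list are illustrative assumptions (the stanExports_ prefix is taken from the object file names in the linker command later in this thread); it is not necessarily what rstantools itself writes.

```r
# Hypothetical helper: append patterns to an ignore file, skipping any
# pattern that is already listed there.
add_ignore_patterns <- function(patterns, file = ".gitignore") {
  existing <- if (file.exists(file)) readLines(file, warn = FALSE) else character(0)
  to_add <- setdiff(patterns, existing)
  if (length(to_add) > 0) {
    writeLines(c(existing, to_add), file)
  }
  invisible(to_add)
}

# Assumed patterns for compiled and generated files under src/.
generated <- c("src/*.o", "src/*.so", "src/*.dll",
               "src/stanExports_*.cc", "src/stanExports_*.h")

# Run from the package root:
# add_ignore_patterns(generated)
```

Note that .Rbuildignore uses regular expressions rather than glob patterns, so the same list should not be copied there unchanged.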

Good question. You’re only half wrong ;) The intention was to allow users who download your package from CRAN to have a compiler-free experience. Unfortunately, install_github() has to install a package from source, which means compiling everything. But once your package gets on CRAN, users on Windows and macOS (where CRAN provides binaries) will be able to install the pre-compiled version, so when they run install.packages("fishflux") nothing will need to be compiled. Does that make sense? If you want to enable people to install a pre-compiled version from GitHub, you could look into making binary versions of your package available on GitHub using drat.
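
The drat route could look roughly like the sketch below. It assumes the pkgbuild and drat packages are installed and that a local clone of a GitHub-hosted drat repository already exists. The function names publish_binary and install_from_drat, the default paths, and the account name in the usage comment are placeholders for illustration. A binary built this way is only usable on the same operating system and a matching R version.

```r
# Maintainer side: build a binary package and add it to a local clone of a
# drat repository (which is then committed and pushed to GitHub).
publish_binary <- function(pkg_dir = ".", drat_dir = "~/git/drat") {
  # pkgbuild::build() returns the path of the file it built.
  binary <- pkgbuild::build(pkg_dir, binary = TRUE)
  drat::insertPackage(binary, repodir = drat_dir)
  invisible(binary)
}

# User side: register the account's drat repository, then install as usual.
install_from_drat <- function(account, pkg) {
  drat::addRepo(account)
  install.packages(pkg)
}

# Hypothetical usage, assuming the account hosts a drat repository:
# install_from_drat("nschiett", "fishflux")
```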

Yeah this was our fault, but will just be temporary. There was a problem with the interaction between StanHeaders and rstantools that we didn’t anticipate and CRAN had to revert rstantools back to 2.0.0.

Hmm, the tbb library is used by Stan but I’m not sure why anything would have changed between versions of rstantools. @wds15 @bgoodri Any ideas about this?

TBB is an essentially unavoidable dependency, but it is provided by RcppParallel, which new versions of rstantools and rstan will bring in.

Hi @martinmodrak, @jonah, and @bgoodri,

Thank you all very much for the explanations, this was extremely helpful!

If you don’t mind, I have one more larger question that builds on the above, but perhaps should be moved to a separate topic (happy to do it if you prefer).

I am currently co-developing another package with my colleague @beckyfisher using rstantools. The point of the package is to provide a specific set of non-linear models (about 10 different equations), in a similar way to how rstanarm makes the SSfunctions.

There are three things that we would like to allow the user to specify on top of the model type:
a) specifying a family for the response variable;
b) priors;
c) addition of an offset parameter on which they can add hierarchical effects, e.g. if \mu = a x^b, it would implement a variant like \mu = o + a x^b, with o being fixed and random, whereas a and b would be fixed only.

We started using rstantools (please see initial attempt here), but the number of family × non-linear equation combinations became quite large (~50 Stan models in inst/stan), so pre-compilation during installation via install_github became prohibitive (about 1 hour to download and install).

We decided (at least for now) to implement (a–c) using brms instead (please see current version here) because it is more flexible; however, it requires compilation on the go. We would still like to be able to implement (a–c) using rstantools, but the main questions then would be:

1 - Would the (eventual) download of a binary version be too heavy, considering a set of 50+ pre-compiled models?

2 - Is there a way to modularise the Stan code by placing small bits and pieces of variants (e.g. changes to \mu formulas on link scales, changes to prior distributions) under inst/include, such that the final pre-compiled product is much lighter?

Thanks again!
Best,
D

There is no way a package with 50 Stan programs would be able to go on CRAN, and even without CRAN it would be difficult to install. You need to use a bunch of if, else if, ..., else statements to express more models in fewer Stan files, as is done in rstanarm.
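
On the R side, this approach usually means passing the user's choices to a single Stan program as integer flags in the data, which the Stan code then branches on. The sketch below is only an illustration: the lookup tables, the option names, the helper make_stan_data, and the field names in the data list are all hypothetical and do not reproduce rstanarm's actual encoding.

```r
# Hypothetical lookup tables: one integer code per supported option.
family_codes <- c(gaussian = 1L, poisson = 2L, binomial = 3L)
model_codes  <- c(power = 1L, exponential = 2L, logistic = 3L)

# Build the data list for one Stan program that is assumed to declare
# integer `family` and `model` entries in its data block and to branch
# on them with if / else if / else.
make_stan_data <- function(x, y, family, model) {
  stopifnot(length(x) == length(y))
  family <- match.arg(family, names(family_codes))
  model  <- match.arg(model, names(model_codes))
  list(
    N = length(y),
    x = x,
    y = y,
    family = unname(family_codes[family]),
    model  = unname(model_codes[model])
  )
}

stan_data <- make_stan_data(x = c(1, 2, 3), y = c(2.1, 3.9, 6.2),
                            family = "gaussian", model = "power")
str(stan_data)
```

With three families and three equations, this covers nine combinations with one compiled program instead of nine.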

Thanks for clarifying @bgoodri, we suspected it was going to be too heavy.

I’m also having trouble getting my RStan-dependent package to build with GitHub Actions.

I get the same error as @dbarneche - a failure to link to tbb i.e.

clang++ -mmacosx-version-min=10.13 -std=gnu++14 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o rater.so RcppExports.o stanExports_class_conditional_dawid_skene.o stanExports_dawid_skene.o stanExports_grouped_data.o stanExports_hierarchical_dawid_skene.o -L'/Users/runner/runners/2.263.0/work/_temp/Library/RcppParallel/lib/' -Wl,-rpath,/Users/runner/runners/2.263.0/work/_temp/Library/RcppParallel/lib/ -ltbb -ltbbmalloc -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: warning: directory not found for option '-L'/Users/runner/runners/2.263.0/work/_temp/Library/RcppParallel/lib/''
ld: library not found for -ltbb
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [rater.so] Error 1
ERROR: compilation failed for package ‘rater’

You can see the full logs in the check results for this PR. It fails on both Mac and Linux but passes on Windows.

This is pretty odd, because the package builds fine on my Mac and on Travis, which I believe uses Linux.

I don’t really know if this is fixable or whether there is a known workaround, but I thought that I’d just record that this is a general problem and hope that someone would have some insight.

Thanks!

@jeffreypullin thanks for sharing and sorry for the hassle. I think the fix is for us to get new versions of rstan and rstantools on CRAN that depend on RcppParallel, which includes TBB and should avoid errors like this.

I know @bgoodri and @wds15 are working on getting rstan submitted. I think the version of rstantools on GitHub is ready to go but is waiting on rstan.

@jonah I was writing a long comment about this when I realized what the problem is: although rstantools has been rolled back to 2.0.0

install.packages("rstantools")

on a Mac will still install 2.1.0, because that is the version of the current CRAN binaries. However, 2.1.0 is incompatible with the latest StanHeaders, which causes the failure.

I don’t see any easy way to fix this except to wait for the CRAN binaries to update.

cc fyi @dbarneche

Just install it from source on Mac. That should give you the right version… still inconvenient, of course…

This should do what @wds15 suggested:

install.packages("rstantools", type="source")
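
To apply this only when it is needed, the binary and source versions can be compared first. This is a rough sketch: needs_source_install is a hypothetical helper, and the commented lookup assumes network access to CRAN and a platform where CRAN offers binaries (macOS or Windows).

```r
# Hypothetical helper: TRUE when the binary on offer is a different
# version from the current source release.
needs_source_install <- function(binary_version, source_version) {
  package_version(binary_version) != package_version(source_version)
}

# The situation described above: binaries at 2.1.0, source rolled back to 2.0.0.
needs_source_install("2.1.0", "2.0.0")

# With network access, the two versions can be looked up on CRAN:
# src <- available.packages(type = "source")["rstantools", "Version"]
# bin <- available.packages(type = "binary")["rstantools", "Version"]
# if (needs_source_install(bin, src)) {
#   install.packages("rstantools", type = "source")
# }
```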

Hi @mcol, @wds15, thanks for your responses.

That’s what I’m doing currently - the problem is GitHub Actions where I can’t customize the installation type easily (or at least I don’t think I can)

Best!

@jeffreypullin, this is our temporary fix:

I’m not familiar with GitHub Actions, but perhaps it’s possible to specify the rstantools GitHub repository as the preferred repository ahead of CRAN? This page shows an example of how to set up actions for R, and it seems you can specify a repository with “uses”: https://github.com/r-lib/actions/tree/master/setup-r (I’m just guessing though)!

I think just adding

install.packages("rstantools", type="source")
before
remotes::install_deps(dependencies = TRUE)

in the Install dependencies part of the GitHub Actions script should also work. It would build from source on all operating systems, though.

Hi all, thanks for all your help - it’s now building successfully! I ended up using @rok_cesnovar’s suggestion to just install rstantools from GitHub. For anyone trying to emulate this, I put the line:

remotes::install_github("stan-dev/rstantools@v2.0.0")

at the start of the install dependencies block.

rstantools 2.1.x is now on CRAN again, so the GitHub workaround should be unnecessary.

Will install.packages("myownpackage.tar.gz") trigger pre-compilation? It installs my package, but it recompiles it.

I packaged my R and Stan code and want to share it with my colleagues. I have 90 Stan files in my package, so it took hours to compile. I don’t want all my colleagues to spend that long installing my package.

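Yes, install.packages recompiles the package when it is given a source tarball (.tar.gz). Source packages do not ship compiled code, so they will always (as far as I know) require compilation on installation; only a binary package built for the same platform and R version, like the CRAN binaries mentioned above, avoids it.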