Using expose_stan_functions in a package


#1

I used rstantools::rstan_package_skeleton to start a new R package based on RStan. I would like to include some data-simulating functions that I wrote in Stan and that I can call from R through expose_stan_functions. Is there a recommended way to do that?

Currently I write an R function like so:

my_r_function = function(params){
  rstan::expose_stan_functions(stanmodels$simulate)
  mat = my_stan_rng(params)
  return(as.data.frame(mat))
}

I understand that this is not a typical use case for (r)stan, so feel free to ignore my question. I am already pleased with the 4x speedup compared to my pure R data-generating function.
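A minimal sketch of one possible variation on the function above: expose the Stan functions once into a private environment and reuse them, instead of calling expose_stan_functions on every call. The Stan program, the signature of my_stan_rng, and the names stan_code and sim_env are hypothetical stand-ins for whatever stanmodels$simulate contains. The env argument is forwarded by expose_stan_functions to Rcpp::sourceCpp.

```r
library(rstan)

# Hypothetical stand-in for the Stan program behind stanmodels$simulate.
stan_code <- "
functions {
  matrix my_stan_rng(int n, real mu, real sigma) {
    matrix[n, 2] out;
    for (i in 1:n) {
      for (j in 1:2) {
        out[i, j] = normal_rng(mu, sigma);
      }
    }
    return out;
  }
}
"

# Private environment that holds the exposed functions (assumed name).
sim_env <- new.env()

my_r_function <- function(n, mu, sigma) {
  # Compile and expose only on first use; later calls reuse the function.
  if (!exists("my_stan_rng", envir = sim_env, inherits = FALSE)) {
    rstan::expose_stan_functions(rstan::stanc(model_code = stan_code),
                                 env = sim_env)
  }
  as.data.frame(sim_env$my_stan_rng(n, mu, sigma))
}

sim_df <- my_r_function(5L, 0, 1)
```

Keeping the functions out of the global environment is a design choice here, not something rstan requires; by default they are exported to globalenv().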


#2

There are solutions to this on the forum, as I recall. The key idea is to use the caching facility of Rcpp's sourceCpp function, which lets you fish out the generated C++ file you are looking for.
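A small sketch of that caching idea using plain Rcpp, not rstan: pass an explicit cacheDir to sourceCpp and look for the .cpp files left behind. The directory name cache_dir and the function add_one are made up for illustration, and the exact layout inside the cache directory may differ between Rcpp versions.

```r
library(Rcpp)

# Hypothetical cache location; any writable directory would do.
cache_dir <- file.path(tempdir(), "rcpp_cache_demo")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

# Toy C++ function standing in for the code generated from a Stan program.
cpp_code <- '
#include <Rcpp.h>
// [[Rcpp::export]]
double add_one(double x) { return x + 1.0; }
'

# Compile into the chosen cache directory and keep its contents.
Rcpp::sourceCpp(code = cpp_code, cacheDir = cache_dir,
                cleanupCacheDir = FALSE)

# The C++ sources end up somewhere below cache_dir.
cpp_files <- list.files(cache_dir, pattern = "\\.cpp$",
                        recursive = TRUE, full.names = TRUE)
```

Since expose_stan_functions forwards extra arguments to sourceCpp, the same cacheDir argument can be passed there.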


#3

Based on this thread

and this one

and this very sage advice in a third thread

I settled on adding the following bit of code in the automatically generated stanmodels.R before rm(MODELS_HOME). I also added an extra folder src/stan/stan_functions.

# Start Addition Stijn
stan_function_files <- dir(file.path(MODELS_HOME, "stan_functions"),
                           pattern = "\\.stan$", full.names = TRUE)
# Expose the functions of each Stan file into its own cache subfolder.
lapply(stan_function_files, function(f) {
  file_name <- sub("\\.stan$", "", basename(f))
  # allow_undefined accepts functions that are declared but not defined here.
  stan_model <- rstan::stanc(f, allow_undefined = TRUE,
                             obfuscate_model_name = FALSE)
  rstan::expose_stan_functions(stan_model,
                               cacheDir = file.path(MODELS_HOME,
                                                    "stan_functions",
                                                    file_name),
                               cleanupCacheDir = TRUE)
})
Rcpp::compileAttributes()
# End Addition Stijn

The code is very much modelled after what is already in stanmodels.R. That is, it loops over all files in the new folder, exposes the functions in a separate subfolder for each file, and cleans up any old .cpp (hence the separate folders). Finally, compileAttributes gathers and links all the functions.

If I figure out how stanmodels.R gets called at installation, I will move this bit of code into a separate file to keep the original file pristine.


#4

Looks good.

You can automate this with the cleanup script or by digging into src/Makevars (I think). The rstanarm package is the best place to see how these are used.


#5

This isn’t the neat solution I thought it was.


#6

I think a general solution is still an open issue, which is fine to be honest. I don't expect Stan to provide a general-purpose wrapper from the Stan language to C++ to R.

See the GitHub discussion

I made it work for my purpose, simulating data, by rewriting the function a little bit: I added a generated quantities section to generate the data and a data section to allow for variable inputs into the data generation. My R function is then a wrapper around rstan::sampling(..., algorithm = "Fixed_param"). This approach has three advantages.

  • The data section does the input checking for me.
  • It’s actually 15% faster than the original functions.
  • It’s closer to prior predictive checking which I will use at some stage anyway.

The disadvantage is the console output from rstan::sampling. I couldn't suppress all of it, so I used sink().
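For readers who want to try this, here is a rough sketch of the approach under my own assumptions; it is not the program from the post. The Stan code, the names sim_code, sim_model and simulate_data, and the choice of one chain with one iteration are all hypothetical. refresh = 0 turns off the iteration progress lines, though depending on the rstan version some messages may still appear.

```r
library(rstan)

# Hypothetical simulation program: inputs go in the data block,
# simulated values come out of generated quantities.
sim_code <- "
data {
  int<lower=1> n;
  real mu;
  real<lower=0> sigma;
}
generated quantities {
  vector[n] y;
  for (i in 1:n) {
    y[i] = normal_rng(mu, sigma);
  }
}
"

sim_model <- rstan::stan_model(model_code = sim_code)

simulate_data <- function(n, mu, sigma, seed = 1) {
  fit <- rstan::sampling(
    sim_model,
    data = list(n = n, mu = mu, sigma = sigma),
    algorithm = "Fixed_param",       # no parameters to sample
    chains = 1, iter = 1, warmup = 0,
    seed = seed,
    refresh = 0                      # no iteration progress output
  )
  y <- rstan::extract(fit, pars = "y")$y   # draws x n array
  data.frame(y = as.vector(y))
}

sim <- simulate_data(10, mu = 0, sigma = 1)
```

The declared constraints in the data block (for example real<lower=0> sigma) are what provide the input checking mentioned above.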


#7

In my experience the fixed_param approach is not nice, because of issues with the output shapes and, most importantly, its slow speed when you have lots of output. Maybe that has changed in the meantime.

However, you should look forward to rstan 2.18.0 which includes a complete rewrite of the expose facility as I recall.