Hi,
Is there a way to call rstan::grad_log_prob or cmdstanr::grad_log_prob functions directly from Rcpp / C++? - preferably without having to transfer data to and from R and C++ every single time it’s called from within Rcpp?
Hi,
Is there a way to call rstan::grad_log_prob or cmdstanr::grad_log_prob functions directly from Rcpp / C++? - preferably without having to transfer data to and from R and C++ every single time it’s called from within Rcpp?
Bridgestan has this R Interface — BridgeStan documentation
can you use this in C++ directly? (without R just all via Rcpp?)
I found this on the website:
“Namely, you can declare a function in your Stan model and then define it in a separate C++ file. This requires passing the --allow-undefined
flag to the Stan compiler when building your model. The USER_HEADER variable must point to the C++ file containing the function definition. By default, this will be the file user_header.hpp
in the same directory as the Stan model.”
However this only works for Stan functions, not any general Stan model. So users would have to be able to write their model as a Stan function.
It would be good if there were a way to do it more generally, not just for Stan functions?
I.e. expose the internal log_prob_grad function that the Stan shared object file contains to a C++ function that anybody can use?
You could define your entire model as a Stan function. Otherwise, @roualdes @Bob_Carpenter @WardBrian might have the answer.
If you’re living in C++, you can either directly use the model class and Stan math library, or use the bridgestan C API documented here C API — BridgeStan documentation
Yeah I could - but (ideally) I want this for my R package where people can just plug in any Stan model without having to do any “extra work” - coding some models as Stan functions isn’t easy as e.g. they might have to code the unconstraied_theta → constrained_theta transforms themselves (as Stan functions), etc. This is because I need the input to the function to just be the data and a vector of unconstrained parameters, regardless of the model.
So my attempt to do it more generally so far has been half successful - I can get the log_prob but not the gradient. Here’s what i’ve done:
Eigen::VectorXd run_stan_log_prob_grad(const std::string &model_so_file,
const std::string &r_data_as_json,
const Eigen::VectorXd ¶ms) {
// load the Stan model from the .so file using BridgeStan
char* error_msg = nullptr;
const char* data = r_data_as_json.c_str();
unsigned int seed = 12345;
// Use bs_model_construct
bs_model* model = bs_model_construct(data, seed, &error_msg);
if (!model) {
std::string error = "Error loading model: ";
if (error_msg) {
error += std::string(error_msg);
bs_free_error_msg(error_msg);
}
throw std::runtime_error(error);
}
// convert Eigen::VectorXd parameters to raw array for BridgeStan
std::vector<double> params_vec(params.data(), params.data() + params.size());
// prepare space for the log probability and gradient
double log_prob_val = 0.0;
std::vector<double> grad_vec(params.size());
// calc. log_prob_grad using the BridgeStan API
int result = bs_log_density_gradient(model,
true, /// propto
true, /// jacobian
params_vec.data(),
&log_prob_val,
grad_vec.data(),
&error_msg);
if (result != 0) {
std::string error = "Error computing log_prob_grad: ";
if (error_msg) {
error += std::string(error_msg);
bs_free_error_msg(error_msg);
}
bs_model_destruct(model);
throw std::runtime_error(error);
}
/// make Eigen output combining log_prob and gradient
Eigen::VectorXd output(params.size() + 1);
output(0) = log_prob_val;
for (int i = 0; i < grad_vec.size(); ++i) {
output(i + 1) = grad_vec[i];
}
// free the model after the computation
bs_model_destruct(model);
return output;
}
// [[Rcpp::export]]
Eigen::VectorXd run_model_from_R(std::string model_so_file,
const std::string &r_data_as_json,
Eigen::VectorXd params) {
// Call the actual C++ function
return run_stan_log_prob_grad(model_so_file, r_data_as_json, params);
}
edit: the R wrapper function is there just so I can test it from R, but i’ll be primarily calling the “run_stan_log_prob_grad” function directly from C++ (however the model data will be inputted from R by the user)
At first glance that looks mostly correct, except for the lack of loading the .so (which may be complicated if you want to support *nix and Windows in the same code). What issue are you having?
Also note that if you intend to perform multiple evaluations with the same data, it is very critical for performance that you move the construction and destruction outside the core loop
Oh sorry I think I forgot to mention - so basically my output is just the log_prob (which is correct) as the first element, but the rest (i.e., the gradient) always comes out as 0’s… maybe there’s something wrong with the .so file, as getting that to work was a pain (I dont think cmdstanr automatically generates the files as shared objects?)
I can’t immediately spot an error in your code, but note that you can make it simpler by just re-using the memory from the eigen vectors:
Eigen::VectorXd output(params.size() + 1);
int result = bs_log_density_gradient(model,
true, /// propto
true, /// jacobian
params.data(), // no need for params_vec
output.data(),
output.data() + 1,
&error_msg);
This may fix the issue “by accident” by over-writing an issue we cant spot
do you know if there’s another way / a “standard” way to do what i’m trying to do? (with or without BridgeStan, but I imagine it’d be easier with it)?
Also is there some kind of option in cmdstan to automatically compile the model into so file/shared object or easily convert it to one? (if not, can you make bridgestan work without .so files?)
I’m not sure what you’re asking in your first reply — the closest thing we have to a “standard” way of getting the log density and gradient of a generic stan model is bridgestan. You could do it yourself in C++, but most of what you’d end up doing is re-writing parts of bridgestan to do it
Bridgestan’s repo has an example of how to use it in a statically linked context, and you can also use the same object file from a model for both cmdstan and bridgestan if you run the linking step for both still
Hey sorry for the confusion - I got it working now. I reinstalled R including bridgestan, and it works now. Not sure what I did different but yeah.
Just to clarify, when you say: " it is very critical for performance that you move the construction and destruction outside the core loop", does that mean - for example - that I can call the construction once at the beginning, call the model N times, then call the destruction once at the end of the sampling?
Yes, that’s precisely what to do for the best performance
Also fyi, to get it to compile at all, I had to make a “libbridgestan.so” file from the “bridgestan.o” file that comes with the R package, and then link it to a Stan model .so file (I dont think it matters which model I use - seems arbitrary) by adding the paths of both of them to PKG_LIBS - otherwise i’d get this error:
Error in dyn.load("/tmp/RtmptjdJAc/sourceCpp-x86_64-pc-linux-gnu-1.0.13/sourcecpp_817007d1b0744/sourceCpp_49.so") :
unable to load shared object '/tmp/RtmptjdJAc/sourceCpp-x86_64-pc-linux-gnu-1.0.13/sourcecpp_817007d1b0744/sourceCpp_49.so':
/tmp/RtmptjdJAc/sourceCpp-x86_64-pc-linux-gnu-1.0.13/sourcecpp_817007d1b0744/sourceCpp_49.so: undefined symbol: _Z9new_modelRN4stan2io11var_contextEjPSo
Not sure if that’s a bug or not but probably something i’m doing wrong. Oh well at least it works though.
Without knowing more about your build process it’s hard to say more, but that shouldn’t be necessary on BridgeStan’s end
For people who are trying similar ideas in the future, I just extended the bridgestan C example folder to also include an example of loading the bridgestan library at runtime, rather than linking it into your program at build: