RStan refactor branch

syclik · October 8, 2016, 3:08am

@bgoodri and @jonah: I’m at the point where I need your help.

I’m on branch: feature/stan-services in RStan. Here’s what I’m doing to build it:

R CMD build StanHeaders/
R CMD INSTALL StanHeaders_*.tar.gz

cd rstan
make build; make install

Below is what I’ve done to test all of it. It all works, at calling the C++, but we need to get the writers in place so that we get samples pushed through to the stan fit object.

library(rstan)

stancode <- 'data {real y_mean;} parameters {real y;} model {y ~ normal(y_mean,1);}'
mod <- stan_model(model_code = stancode)
smwd <- new("stanmodel_with_data", model = mod, data = list(y_mean = 2))
init = list(); seed = 1; id = 0; init_radius = 2


## diagnose
smwd@model$diagnose(init, seed, id, init_radius, 1e-6, 1e-6)

## optimize
smwd@model$optimize_bfgs(init, seed, id, init_radius, 0.001, 1e-12, 10000, 1e-8, 10000000, 1e-8, 2000, 1, 10)

smwd@model$optimize_lbfgs(init, seed, id, init_radius, 5, 0.001, 1e-12, 10000, 1e-8, 10000000, 1e-8, 2000, 1, 10)

smwd@model$optimize_newton(init, seed, id, init_radius, 2000, 1)

smwd@model$sample_fixed_param(init, seed, id, init_radius, 1000, 1, 100)

## sample
num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; max_depth = 10
 smwd@model$sample_hmc_nuts_dense_e(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, max_depth)
 
num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; max_depth = 10; delta = 0.8; gamma = 0.05; kappa = 0.75; t0 = 10; init_buffer = 75; term_buffer = 50; window = 25
smwd@model$sample_hmc_nuts_dense_e_adapt(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, max_depth, delta, gamma, kappa, t0, init_buffer, term_buffer, window)

num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; max_depth = 10;
smwd@model$sample_hmc_nuts_diag_e(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, max_depth)

num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; max_depth = 10; delta = 0.8; gamma = 0.05; kappa = 0.75; t0 = 10; init_buffer = 75; term_buffer = 50; window = 25
smwd@model$sample_hmc_nuts_diag_e_adapt(init, seed, id, init_radius,num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, max_depth, delta, gamma, kappa, t0, init_buffer, term_buffer, window)

num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; max_depth = 10;
smwd@model$sample_hmc_nuts_unit_e(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, max_depth)

num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; delta = 0.8; gamma = 0.05; kappa = 0.75; t0 = 10;
smwd@model$sample_hmc_nuts_unit_e_adapt(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, max_depth, delta, gamma, kappa, t0)

num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; int_time = 2 * pi
smwd@model$sample_hmc_static_dense_e(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, int_time)

num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; int_time = 2 * pi; delta = 0.8; gamma = 0.05; kappa = 0.75; t0 = 10; init_buffer = 75; term_buffer = 50; window = 25
smwd@model$sample_hmc_static_dense_e_adapt(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, int_time, delta, gamma, kappa, t0, init_buffer, term_buffer, window)

num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0;
smwd@model$sample_hmc_static_diag_e(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, int_time)

num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; int_time = 2 * pi; delta = 0.8; gamma = 0.05; kappa = 0.75; t0 = 10; init_buffer = 75; term_buffer = 50; window = 25; 
smwd@model$sample_hmc_static_diag_e_adapt(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, int_time, delta, gamma, kappa, t0, init_buffer, term_buffer, window)

num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; int_time = 2 * pi;
smwd@model$sample_hmc_static_unit_e(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, int_time)
 
num_warmup = 1000; num_samples = 1000; num_thin = 1; save_warmup = 0; refresh = 100; stepsize = 1; stepsize_jitter = 0; int_time = 2 * pi; delta = 0.8; gamma = 0.05; kappa = 0.75; t0 = 10; 
smwd@model$sample_hmc_static_unit_e_adapt(init, seed, id, init_radius, num_warmup, num_samples, num_thin, save_warmup, refresh, stepsize, stepsize_jitter, int_time, delta, gamma, kappa, t0)

##experimental: variational
grad_samples = 1; elbo_samples = 100; max_iterations = 10000; tol_rel_obj = 0.01; eta = 1.0; adapt_engaged = 1; adapt_iterations = 50; eval_elbo = 100; output_samples = 1000
 
smwd@model$experimental_advi_fullrank(init, seed, id, init_radius, grad_samples, elbo_samples, max_iterations, tol_rel_obj, eta, adapt_engaged, adapt_iterations, eval_elbo, output_samples)
smwd@model$experimental_advi_meanfield(init, seed, id, init_radius, grad_samples, elbo_samples, max_iterations, tol_rel_obj, eta, adapt_engaged, adapt_iterations, eval_elbo, output_samples)

The remaining items that we need to deal with:

setting up writers so the values get back to R
getting the defaults from Stan into the right parameters
wish: change the level of the functions so they apply to smwd directly and not on smwd@model. Is that possible? (so smwd@sample_hmc_nuts_dense_e_adapt(...) instead of smwd@model$sample_hmc_nuts_dense_e_adapt(...)

bgoodri · October 8, 2016, 3:48pm

We need to decide how we want to store the values in R. A lot of people dislike the current behavior which is a list of lists of arrays.

syclik · October 8, 2016, 4:04pm

That’s really a separate decision.

I’m going through now and changing the current branch to use RStan. I
really don’t want to wait around waiting for a decision and design. If
you’re able to come up with a solution over the next couple days, then we
can go with it.

bgoodri · October 8, 2016, 5:32pm

I can’t imagine something like this being decided in a couple of days, particularly not the next couple of days. We wrote down some ideas along these lines years ago

If you just want output, I would implement the CSV writer for now and read it into a stanfit object with read_stan_csv().

syclik · October 10, 2016, 3:36pm

@bgoodri, that’s the answer I expected. I’m just going to replace the existing code with something that’ll work for this branch.

What happens after sampler_command() inside stan_fit.hpp? What bit of code is executed next? I’m running into some errors downstream and I need to figure out what exactly is happening.

sakrejda · October 10, 2016, 4:06pm

Daniel, line 605 in stanmodel-class.R is where call_sampler gets called (at
least for the sampling command, vb and optim are before that), there’s
really nothing in call_sampler after sampler_command, so the next calls are
from the “sampling” method in stanmodel-class.R.

If you post the specific errors I might be able to glance at the code and
tell you what’s going on.

Krzysztof

syclik · October 10, 2016, 4:15pm

Thanks! I’ll see if I can track down where it goes after that. I’ll push a branch later.

syclik · October 10, 2016, 5:29pm

@sakrejda I’ll need some more help.

I just pushed feature/refactor-stan-services.

This is what I run:

library(rstan)
fit <- stan(model_code = "parameters { real<lower = 0, upper = 1> theta; } model { }")

I’m able to generate an nfit object in stanmodel-class.R in the setMethod("sampling", "stanmodel",...) method.

In rstan.R, when I run this line:

  sampling(sm, data, pars, chains, iter, warmup, thin, seed, init, 
           check_data = TRUE, sample_file = sample_file, 
           diagnostic_file = diagnostic_file,
           verbose = verbose, algorithm = match.arg(algorithm), 
           control = control, check_unknown_args = FALSE, 
           cores = cores, open_progress = open_progress, include = include, ...)

I get this result:

Error in do.call(cbind, attr(x, "sampler_params")) : 
  second argument must be a list

Help?! I don’t know where that’s coming from.

sakrejda · October 10, 2016, 6:50pm

Looks like line 325-326 in stanfit-class.R:

  ldf <- lapply(object@sim$samples, function(x) do.call(cbind, attr(x,

“sampler_params”)))

So do.call expects a first argument that is a function and a second
argument that is a list of arguments to the function. It calls the
function (first argument) with arguments pulled from the list (second
argument).

So object@sim$samples is supposed to have an attribute that is a list (is
it an entry in the list per chain?) and in your call it ends up not being a
list. One of the ways to make R’s errors easier on yourself is to do

options(error=recover)

before you run your code. Then when you get the error you get something
much like a backtrace and you can figure out where things went wrong
interactively. In this case you could check what kind of object the
"sampler_params" attribute of object@sim$samples really is.

Krzysztof

syclik · October 13, 2016, 1:18am

Woohoo! I’m done. For anyone able to work with branches of RStan, can you run through this branch?
feature/refactor-stan-services

If there’s any behavior that’s inconsistent with the current branch, it’d be good to know. I’ve been testing with this:

Build:

R CMD build StanHeaders/
R CMD INSTALL StanHeaders_*.tar.gz
cd rstan
make build && make install

Then from within R:

library(rstan)
## options(mc.cores = parallel::detectCores())
fit <- stan(model_code = "transformed data { int N; int n; N = 10; n = 3; } parameters { real<lower = 0, upper = 1> theta; } model { n ~ binomial(N, theta); }")
fit
get_inits(fit)

## 1. diagnose:
fit <- stan(fit = fit, test_grad = TRUE)

## 2: optimize newton
optimizing(get_stanmodel(fit), algorithm = "Newton")

## 3: optimize bfgs
optimizing(get_stanmodel(fit), algorithm = "BFGS")

## 4: optimize lbfgs
optimizing(get_stanmodel(fit), algorithm = "LBFGS")
                               
## 5: sample fixed_param
fit <- stan(fit = fit, algorithm = "Fixed_param")

## 6: sample nuts dense_e adapt_false
fit <- stan(fit = fit, algorithm = "NUTS", control = list(metric = "dense_e", adapt_engaged = FALSE))

## 7: sample nuts dense_e adapt_true
fit <- stan(fit = fit, algorithm = "NUTS", control = list(metric = "dense_e", adapt_engaged = TRUE))

## 8: sample nuts diag_e adapt_false
fit <- stan(fit = fit, algorithm = "NUTS", control = list(metric = "diag_e", adapt_engaged = FALSE))

## 9: sample nuts diag_e adapt_true
fit <- stan(fit = fit, algorithm = "NUTS", control = list(metric = "diag_e", adapt_engaged = TRUE))

## 10: sample nuts unit_e adapt_false
fit <- stan(fit = fit, algorithm = "NUTS", control = list(metric = "unit_e", adapt_engaged = FALSE))

## 11: sample nuts unit_e adapt_true
fit <- stan(fit = fit, algorithm = "NUTS", control = list(metric = "unit_e", adapt_engaged = TRUE))

## 12: sample static dense_e adapt_false
fit <- stan(fit = fit, algorithm = "HMC", control = list(metric = "dense_e", adapt_engaged = FALSE))

## 13: sample static dense_e adapt_true
fit <- stan(fit = fit, algorithm = "HMC", control = list(metric = "dense_e", adapt_engaged = TRUE))

## 14: sample static diag_e adapt_false
fit <- stan(fit = fit, algorithm = "HMC", control = list(metric = "diag_e", adapt_engaged = FALSE))

## 15: sample static diag_e adapt_true
fit <- stan(fit = fit, algorithm = "HMC", control = list(metric = "diag_e", adapt_engaged = TRUE))

## 16: sample static unit_e adapt_false
fit <- stan(fit = fit, algorithm = "HMC", control = list(metric = "unit_e", adapt_engaged = FALSE))

## 17: sample static unit_e adapt_true
fit <- stan(fit = fit, algorithm = "HMC", control = list(metric = "unit_e", adapt_engaged = TRUE))

## 18: variational fullrank
vb(get_stanmodel(fit), algorithm = "fullrank")

## 19: variational meanfield
vb(get_stanmodel(fit), algorithm = "meanfield")

jonah · October 13, 2016, 1:24am

Cool! I can give it a try.

syclik · October 13, 2016, 9:30pm

I just pushed. The adaptation info and the elapsed time is now being propagated. That should be all the existing functionality.

jonah · October 13, 2016, 10:21pm

I was able to get it built and I ran through your test code as well as a few toy examples of my own. As far as I can tell it seems to be running smoothly!

jonah · October 14, 2016, 1:01am

Oh and yes, get_adaptation_info also seems to be working again. Thanks!

syclik · October 14, 2016, 1:12am

Yeah. I got that up and running again.

syclik · November 29, 2016, 4:12am

Bump. Can someone give me a timeline and a list of things we need to do for the refactor branch to be deemed ok?

ariddell · November 29, 2016, 11:48am

Is the “default parameters” functionality working? I’m particularly looking forward to that. There’s lots of duplicated code that will replace.

syclik · November 29, 2016, 1:45pm

Yes, but perhaps not as you expect it to work.

Defaults are implemented, but I was talked out of handling all of the dispatching from Stan. I can work that through the PyStan branch.

Topic		Replies	Views
Stan_model() return compileCode errors Modeling rstan	17	984	September 14, 2022
Compilation error (rstan) General	2	1063	July 3, 2019
Compiler Optimization Hints? RStan	7	980	June 10, 2020
Problem with C++ Compiler on Mac RStan	2	2397	October 27, 2018
R4.0.1 on Rstudio Server (linux) RStan	1	529	June 20, 2021

RStan refactor branch

Related topics