We had an unusually productive discussion today on Stan 3, mostly from the perspective of RStan. You can install a somewhat working version of the provisional rstan3 package from GitHub with
devtools::install_github("stan-dev/rstan", ref = "develop", subdir = "rstan3")
which sits on top of rstan 2.15.1 from CRAN. Much later I will make a branch that can be merged into rstan and then rstan3 will be deleted.
I also updated the wiki but if you want to follow what rstan3 is doing, it is probably best to look at https://github.com/stan-dev/rstan/blob/develop/rstan3/R/rstan.R .
You can currently do
library(rstan3)
J <- 8
y <- c(28, 8, -3, 7, -1, 1, 18, 12)
sigma <- c(15, 10, 16, 11, 9, 11, 10, 18)
config <- stan_config("8schools.stan") # wait a long time for it to compile
config$data(J, y, sigma)
built <- config$build(seed = 12345) # or maybe <- stan_build(config)
post <- built$sample()
This reflects some but not all of the opinions that were expressed this morning. I don’t love `stan_config` because I don’t think users mentally associate the data they are passing to Stan with “configuration”, and things that they do mentally associate with “configuration” (like `max_treedepth`) would not be specified until later. Anyway, `config` is pretty much a var_context, and there was a minor open question as to whether `config` should have a field for the PRNG seed that could be set like `config$seed <- 12345` or whether the seed should be an argument to the `build` method, which instantiates the thing on the C++ side.
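To make the two seed options concrete, here is a minimal base-R sketch. `new_config` is a hypothetical stand-in I made up for illustration (it is not how rstan3 implements `stan_config`), and the returned list is just a placeholder for the instantiated object.

```r
# Hypothetical stand-in for the config object; NOT the rstan3 implementation.
new_config <- function() {
  config <- new.env()
  config$seed <- NULL
  # build() takes the seed as an argument and falls back to the field.
  config$build <- function(seed = config$seed) {
    if (is.null(seed)) stop("no seed supplied")
    list(seed = as.integer(seed))  # placeholder for the instantiated object
  }
  config
}

# Option 1: the seed is a field that is set before building.
config1 <- new_config()
config1$seed <- 12345
built1 <- config1$build()

# Option 2: the seed is an argument to build().
config2 <- new_config()
built2 <- config2$build(seed = 12345)
```

Both routes end up with the same seed on the built object, so the choice is mostly about where users expect to find it.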
There were also differing opinions as to whether `$sample` should be a method of `built` or a method of some other class that only exposes methods for algorithms. In that case, it could go like
algo <- built$algo()
post <- algo$sample()
and `built` would only expose low-level methods like `log_prob`. I don’t think enough R users would call the low-level methods to justify making the rest of them take an extra step.
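The difference between the two layouts can be sketched with plain lists of closures. Everything here (`make_built_a`, `make_built_b`, the toy `lp` function, and the placeholder return value of `sample`) is hypothetical and only meant to show the shape of each interface, not anything that exists in rstan3.

```r
# Hypothetical constructors; none of these exist in rstan3.

# Layout A: the built object exposes low-level and algorithm methods together.
make_built_a <- function(log_prob) {
  list(
    log_prob = log_prob,
    sample = function(n = 4) list(method = "sample", n = n)
  )
}

# Layout B: the built object exposes only low-level methods plus algo(),
# which returns a separate object that holds the algorithm methods.
make_built_b <- function(log_prob) {
  list(
    log_prob = log_prob,
    algo = function() {
      list(sample = function(n = 4) list(method = "sample", n = n))
    }
  )
}

lp <- function(theta) -0.5 * sum(theta^2)  # toy standard-normal log density

built_a <- make_built_a(lp)
post_a <- built_a$sample()         # one step

built_b <- make_built_b(lp)
post_b <- built_b$algo()$sample()  # extra step through algo()
```

The results are identical; layout B just keeps `sample` off the object that carries `log_prob`, at the cost of one more call for everyone.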
One of the bigger open questions is whether to make the names of the estimating functions in the interface correspond exactly to the C++ functions or to make the names shorter and the argument list longer. For example, `$hmc_nuts_diag_e_adapt()` vs. `$hmc()` with defaults `diag = TRUE, adapt = TRUE, metric = "Euclidean"`. With long names, it is possible to enumerate exactly the arguments that are accepted by the estimation function. With short names, some R arguments could render other R arguments moot, as in `$hmc(adapt = FALSE, delta = 0.9)`. Either way, `$sample()` is basically an alias for default MCMC, which is what 99% of R users would be calling. Also, there is the question of what to do about degrees of adaptation. In some cases, the user just wants to pass an initial guess of the mass matrix but adapt it, other times fix the mass matrix but adapt the stepsize, and other times fix both the mass matrix and the stepsize.
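As a rough sketch of the short-name option, a free function `hmc()` could map its arguments back onto a long C++-style name and warn when one argument makes another moot. The function below is hypothetical: the argument names come from the example above, but the warning, the `"dense"` branch, the fallback `delta = 0.8`, and the returned list are my assumptions rather than anything settled in the discussion.

```r
# Hypothetical stand-in for a $hmc() method; the checks are assumptions.
hmc <- function(diag = TRUE, adapt = TRUE, metric = "Euclidean", delta = NULL) {
  if (metric != "Euclidean") stop("only the Euclidean metric is sketched here")
  if (!adapt && !is.null(delta)) {
    # delta only matters during adaptation, so it is moot here.
    warning("delta is ignored when adapt = FALSE")
    delta <- NULL
  }
  if (adapt && is.null(delta)) delta <- 0.8  # assumed default
  # Assemble the long name that the short call corresponds to.
  long_name <- paste0(
    "hmc_nuts_",
    if (diag) "diag" else "dense",
    "_e",
    if (adapt) "_adapt" else ""
  )
  list(call = long_name, delta = delta)
}

default_call <- hmc()
moot_call <- suppressWarnings(hmc(adapt = FALSE, delta = 0.9))
```

With long names the moot combination could not even be written, because the non-adaptive function would simply not have a `delta` argument.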
There was also a lot of discussion about the output, but that can mostly be figured out later. In the interim, it would be good if there were
- a virtual class that is inherited from rather than `prob_grad` and requires things like `log_prob` to be specified by the inheriting class
- a static method that could be called on the compiled but uninstantiated C++ object that would return the types of the things declared in the `data` block of a Stan program
- a method that could be called on the instantiated C++ object that would return the types, sizes, and block of things declared in the `parameters`, `transformed parameters`, and `generated quantities` blocks
- more progress on the var_context refactor
Ben