Service API parameter objects


#1

[ This was originally a comment on another thread on consolidated output. ]

I’m very much behind consolidating argument groups into structures (aka “parameter objects”).

Existing service method

Here’s the existing service method for our default sampler (copied from the Wiki on consolidated output):

template <class Model>
int hmc_nuts_diag_e_adapt(Model& model, stan::io::var_context& init,
                          stan::io::var_context& init_inv_metric,
                          unsigned int random_seed, unsigned int chain,
                          double init_radius, int num_warmup,
                          int num_samples, int num_thin, bool save_warmup,
                          int refresh, double stepsize,
                          double stepsize_jitter, int max_depth,
                          double delta, double gamma, double kappa,
                          double t0, unsigned int init_buffer,
                          unsigned int term_buffer, unsigned int window,
                          callbacks::interrupt& interrupt,
                          callbacks::logger& logger,
                          callbacks::writer& init_writer,
                          callbacks::writer& sample_writer,
                          callbacks::writer& diagnostic_writer);

Parameter objects

There are obvious groupings of arguments, and what this thread is discussing is the last bundle. I think it’d help overall if we refactored these service functions into bundles of arguments. The above would look like this:

int
hmc_nuts_diag_e_adapt(
  const base_model& model,
  const inits& inits,
  const seed& seed,
  const iterations& its,
  const nuts_config& max_depth,
  const step_adapt& sac,
  const metric_adapt& mac,
  outputs& out);

model–all

  base_model& model,

inits—HMC & NUTS

( double init_radius,
  stan::io::var_context& init,
  stan::io::var_context& init_inv_metric,
  double stepsize ),

seed—all

( unsigned int random_seed,
  unsigned int chain ),

iterations–HMC & NUTS

( int num_warmup,
  int num_samples,
  int num_thin,
  bool save_warmup,
  int refresh ),

nuts_config–NUTS

( int max_depth ),

step_adapt–HMC and NUTS w. stepsize adaption

( double delta,
  double gamma,
  double kappa,
  double t0 ),

metric_adapt–HMC and NUTS w. inverse metric adaptation

( unsigned int init_buffer,
  unsigned int term_buffer,
  unsigned int window ),

outputs–HMC and NUTS

( callbacks::interrupt& interrupt,
  callbacks::logger& logger,
  callbacks::writer& init_writer,
  callbacks::writer& sample_writer,
  callbacks::writer& diagnostic_writer )

I’m pretty sure these are all safe in the sense that we’ll have the same clusters of arguments for many different samplers and optimizers. That is, each service function can be built using a few common grouped argument structures. This is less ambitious than having a top-level structure, but will support many of the same goals as the original design. Some issues with these structures:

  • the templated model can be replaced with a base class so that this can all be precompiled. As @betanalpha has pointed out, we’ll need to carefully test this to make sure it doesn’t cost any measurable efficiency. It should be a big win in terms of compile time.

  • There’s an ongoing discussion to refactor the elements of the output object.

  • The inverse metric adaptation (which is what we’re really doing) should also have the regularization parameters exposed for our estimation.

  • The var_context lets us neatly finesse the actual types of some arguments,

  • I dropped stepsize_jitter, which I think is OK at this point. Alternatively, we could include it in a group of HMC config arguments (it shouldn’t go with max_depth, which is HMC only, unless the nuts_config inherits from hmc_config).

  • num warmup and num sample iterations may be better off on their own—the other parameters in that bundle are about filtering output (removing inits, refreshing output, and thinning).

  • Putting the chain id and random seed together seems awkward, but the chain id is primarily used to identify the output chain and to advance the RNG initially so the chains use separate slices of the same RNG.

Builders for “named” args and defaults

Of course, we then need easy builders for the chunks to deal with defaults. Pretty much all of these can get defaults, and a full default construction would just be:

step_adapt sa;  // all default values

Then I’m imagining a standard builder with overloaded operator= rather than a “build” method, e.g.,

step_adapt sa = step_adapt::build().delta(1.3).t0(0.01);

That’s like named arguments in that they can come in any order and they have defaults if they aren’t used at all.

For all of this to be super flexible, it’d also be nice to have a var_context_builder, something like:

var_context vc = var_context::build().matrix("a", m).real("f".2.3); 

Not quite sure how to implement that, though.


#2

This seems to be going back to the configuration design that was rejected in lieu of the current design, but in my opinion that has always been the right path forwards.

But is it preferable to have a builder pattern or go more object oriented to allow full access to the configuration? That would allow the C++ API (and CmdStan) to better mirror the RStan and PyStan UX.


#3

There is full access once constructed. Or did you mean getters and setters?


#4

Really my question is more of why is the builder pattern preferential to instantiating a configuration object with the defaults and then letting users modify those defaults. Or even having the constructor take in a string that the constructor parses into internal mutator calls. Given the large state of any given configuration this seems more manageable to me then a builder pattern, especially if the object returned by the builder is mutable.


#5

Here are the usual general reasons to prefer builders over setters:

  1. Because it creates immutable objects that are always in valid states. This is generally agreed to be a good thing because it makes it easy to reason about programs.

  2. It lets you validate all the arguments and set defaults all at once. This means fewer methods to write if there are interactions among validity of parameters (e.g., thinning must be less than number of iterations) or interactions of defaults (e.g., warmup is half the total number of iterations).

In our particular case, there’s probably not a huge win. For (1), we can probabiliy initialize with defaults in a valid state and then validate the arguments one at a time. For (2), the whole point is to choose parameters so this isn’t an issue to start with.

I think all of the config will be known at once. One might argue that it’d be useful to have a program that sets up config one way, then modifies it a little, then runs with that. I’d rather not encourage that versus just setting up two different configs when two different configs are needed. They can share pieces.

Manageable for clients? On the services side, the builders are much more manageable. There’s just boilerplate chained setters and a single validation.

With a simple, structured API, I hope we can avoid string-based constructors.

I do think we should have a way to serialize the config of a service method, and it makes sense to do that componentwise. Deserialization would construct instances of these objects, but it wouldn’t provide the flexible construction of CmdStan’s input. The real pain with serialization and deserialization is maintaining backward compatibility as the config evolves.


#6

I thought about this a bit more and our use case here isn’t a typical C++ client usage. The interfaces will have some kind of command line and it’ll have to get converted with code that probably looks like this:

// construct and set values
dual_avg_config_builder b;
if (...delta as given as an argument with value x...)
  b.delta(x);
if (...gamma is an argument with value x...)
  b.gamma(x);
if (...kappa is an argument with value x...)
  b.kappa(x);
if (... t0 is an argument with value x...)
  b.kappa(t0);

// assign defaults and validate
dual_avg_config c = b;

// call to service method
nuts_adapt_diag_e(..., c, ...);

If the dual_avg_config instead had an implicit argument constructor for the builder, then you could just pass b in directly without the assignment and the construct and validate step wouldhappen at the point the serivce method was called:

// converts builder to config and validates
nuts_adapt_diag_e(..., b, ...);

With a setter, we’d have something like this:

// construct and set values
dual_avg_config b;
if (...delta as given as an argument with value x...)
  b.delta(x);
if (...gamma is an argument with value x...)
  b.gamma(x);
if (...kappa is an argument with value x...)
  b.kappa(x);
if (... t0 is an argument with value x...)
  b.kappa(t0);

// assign defaults and validate
b.validate();

// call to service method
nuts_adapt_diag_e(..., b, ...);

The awkwardness is that either

  • b isn’t a well-formed dual_avg_config until after validate() is called, or

  • there needs to be a valid initial state with defaults and each setter must be able to validate and update defaults.

I’d be worried about clients assuming the second option and forgetting to validate if we took the simple implementation route of assuming validate() would be called. I don’t think we want to burden each call to completely validate itself.

The goal is to have the local interface for, say, dual averaging adaptation to do the error checking, not have each service method have to duplicate the code.


#7

Right – it sounded like you were talking about a mutable object which would make the builder pattern less appropriate.

Sure, but when we have such a large configuration we’ll need to fall back to defaults so that he user doesn’t have to specify everything.

Manageable in the sense of only have to specify the configuration that the user wants to change, leaving the rest to defaults.

With the existing arguments code in CmdStan I used a whole bunch of templates to abstract away the validation code. New arguments would be specified with constraints, as well as good and bad values, so that the validation and testing are all automated. Regardless I think we’ll be able to avoid most of the programming burden for validation.


#8

I was proposing builders before I realized that the interface clients really need the mutability in order to coherently be able to compile calls in a loop.

I agree we’ll be able to make the validation of everything pretty simple.

I’ve realized that the naturally structured C++ interface isn’t really the most natural from the interface point of view. From RStan or PyStan, it makes more sense to have a single object collect flat commands and then have the C++ on the Stan side figure out where to route it. I’ll make a more coherent proposal for what I’m talking about when I get a chance.