CmdStanPy NUTS sampler argument names adapt_delta vs target_accept_rate

The CmdStanPy wrapper is still very much in beta. I’m fixing some problems with the initial implementation of the sample function, which wraps the call to CmdStan. CmdStan’s command-line syntax uses nested keywords, e.g. adapt delta=0.99, whereas RStan uses a flattened name, adapt_delta=0.99.
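To make the nested-versus-flat difference concrete, here is a minimal sketch of how a wrapper could turn a flat keyword into CmdStan's nested command-line tokens. The names `FLAT_TO_NESTED`, `nested_tokens`, `sample_command`, and the executable path are hypothetical and are not CmdStanPy's actual implementation; only the `sample adapt delta=...` token order is taken from the CmdStan syntax described above.

```python
# Hypothetical mapping from flat, RStan-style names to CmdStan's nested keywords.
# Only adapt_delta is listed; this is an illustration, not a complete table.
FLAT_TO_NESTED = {
    "adapt_delta": ("adapt", "delta"),
}


def nested_tokens(**flat_args):
    """Translate flat keyword arguments into nested CmdStan-style tokens."""
    tokens = []
    for name, value in flat_args.items():
        if name not in FLAT_TO_NESTED:
            raise ValueError(f"unsupported argument: {name}")
        group, key = FLAT_TO_NESTED[name]
        # Emits the group keyword followed by key=value, e.g. "adapt", "delta=0.99".
        # A fuller version would emit the group once for several keys in the same group.
        tokens += [group, f"{key}={value}"]
    return tokens


def sample_command(exe_path, **flat_args):
    """Build the argument list for a sampling run of a compiled model."""
    return [exe_path, "sample"] + nested_tokens(**flat_args)
```

Calling `sample_command("./my_model", adapt_delta=0.99)` yields a list that could be passed to `subprocess.run`; an explicit lookup table is used rather than splitting on underscores because names like max_depth would split incorrectly.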

The problem is that adapt_delta is not a very descriptive name. It’s described here in the Reference Manual: https://mc-stan.org/docs/2_19/reference-manual/hmc-algorithm-parameters.html

Step Size Adaptation Parameters Table. The parameters controlling step size adaptation, with constraints and default values.

parameter | description                       | constraint | default
delta     | target Metropolis acceptance rate | [0, 1]     | 0.8
gamma     | adaptation regularization scale   | (0, ∞)     | 0.05
kappa     | adaptation relaxation exponent    | (0, ∞)     | 0.75
t_0       | adaptation iteration offset       | (0, ∞)     | 10

Of this set of parameters, there’s only guidance on delta. It would be nice to rename delta to something like target_accept_rate, although that is already too long.

Although this might be a better name, would it be better to just go with the existing name because that way no one is confused and things line up across interfaces (as closely as possible)?

This is the only reason I’d be hesitant to change it (I agree with you that adapt_delta is not an informative name). Unless there’s a plan for all the interfaces to change to the new name, in which case maybe we should change it and the timing of the name change should be coordinated?

The problem is that the more popular the interface, the more important it is to maintain backwards compatibility, and therefore names have to stay the same.

CmdStan isn’t very widely used, and CmdStanPy is in beta, so in theory that should free us up to experiment. But changing names makes it more difficult to reuse existing teaching materials, case studies, etc.

There’s a conservative plan to make CmdStan’s command line arguments flatter, which would make configuring a CmdStan run a lot easier. Names will have to change at that point.

A related question: would it be OK in CmdStanPy to not expose the arguments gamma, kappa, and t_0 at all?

They are definitely rarely used. I guess if needed they could just be exposed later on.

I didn’t find any docs on how to use them in the manuals. The Stan Reference Manual has a paragraph on how and why to change delta, but there is no explanation or guidance on the hows and whys for gamma et al.

OK, yeah, in that case I definitely don’t think they need to be exposed from Python at this point.

All of the adapt parameters, aside from the buffer and window parameters, are configurations of the dual averaging of the step size. They take their names from the conventions Matt Hoffman introduced in the original No-U-Turn sampler paper, http://jmlr.org/papers/volume15/hoffman14a/hoffman14a.pdf. \delta sets the objective while \mu, \gamma, t_{0}, and \kappa configure the adaptation itself. Dual averaging is somewhat sensitive to these parameters and changing them without blowing things up requires care and experience with dual averaging/stochastic optimization/etc, but their availability does provide the opportunity for an expert to be able to tweak the step size adaptation routine.
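For readers who want to see where each parameter enters, here is a minimal sketch of the dual averaging step size update as I understand it from the Hoffman and Gelman paper. The class and method names are made up for illustration, the defaults are the ones from the table above, and \mu is set to log(10 * initial step size) following the paper's suggestion. Stan's own implementation may differ in details.

```python
import math


class DualAveraging:
    """Illustrative dual averaging of the log step size (not Stan's actual code)."""

    def __init__(self, initial_step_size, delta=0.8, gamma=0.05, kappa=0.75, t0=10.0):
        self.mu = math.log(10.0 * initial_step_size)  # point the iterates shrink toward
        self.delta = delta    # target acceptance statistic
        self.gamma = gamma    # regularization scale
        self.kappa = kappa    # relaxation exponent for the averaged iterate
        self.t0 = t0          # iteration offset that damps early updates
        self.h_bar = 0.0      # running average of (delta - accept_stat)
        self.log_eps_bar = 0.0
        self.m = 0

    def update(self, accept_stat):
        """Take one adaptation step; return the step size for the next iteration."""
        self.m += 1
        accept_stat = min(1.0, accept_stat)
        eta = 1.0 / (self.m + self.t0)
        self.h_bar = (1.0 - eta) * self.h_bar + eta * (self.delta - accept_stat)
        log_eps = self.mu - math.sqrt(self.m) / self.gamma * self.h_bar
        weight = self.m ** (-self.kappa)
        self.log_eps_bar = weight * log_eps + (1.0 - weight) * self.log_eps_bar
        return math.exp(log_eps)

    def adapted_step_size(self):
        """Averaged step size, to be used once adaptation has finished."""
        return math.exp(self.log_eps_bar)
```

An acceptance statistic above delta pushes the next step size up and one below delta pushes it down, which is the sense in which \delta sets the objective; \gamma, t_{0}, and \kappa only control how aggressively and how smoothly the step size moves toward it.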

Thanks, that’s very clear.

CmdStanPy is trying to be a lightweight interface to CmdStan. Not everything is going to be exposed, e.g. only the NUTS sampling algorithm, not the other samplers. Given that, I’m going to go with adapt_delta and not expose the other three parameters.