I just ran the bernoulli model following the tutorial notebook. I usually use CmdStan now, but may need to go back to PyStan for a specific job. I’m not sure what the original goal of CmdStanPy was, but to me it would be useful to have Python organize/process the data (easy with Pystan), run Stan inference (CmdStan allows better control), and have it back in Python for plotting, diagnostics, etc (takes some io to read it back from CmdStan output, but that’s kind of the easy part). CmdStanPy looks like it could be the best of both worlds.
So a few comments/questions:
is it possible to use a CmdStan-compiled model, and are all CmdStan arguments accessible (e.g engaged=0)?
I see that there are temporary .csv files, is there a way of saving (multiple chains) into the data folder; alternatively, since the CmdStan files are there, is it possible to extract values like the mass matrix within Python?
I haven’t used it in a while, but I think PyStan had an extract option that returned a multi-dimensional numpy.ndarray instead of a pandas.DataFrame. Not a big deal, anyway.
yes - construct a Model object specifying both arguments stan_file and exe_file
damn! the answer to that is supposed to be yes, but looks like engaged got overlooked. will file issue…
filing two more features requests for you -
yes, there should be a function which lets you specify a directory into which the per-chain csv files are saves
yes, it would be possible to write a function to extract the step size and mass matrix from a given chain.
The reason that CmdStanPy’s extract function returns a pandas.DataFrame is so that you can specify a list of parameter names and it will return a view on the DataFrame - at least that’s the intent - please test to break!
Thanks. I will soon be running some serious analyses and maybe I will have some more constructive feedback, some things like array vs data frame are mostly convenience, anyway, and this looks great, already.
the PosteriorSample object has attributes column_names and sample - the latter is the ndarray that contains the entire sample stored column_major (F format) as ( draws X chains X parameters ) and the former are the column labels.
the CmdStanPy equivalent of CmdStan’s engaged argument was renamed to do_adaptation. renamed by me and perhaps this was a bad decision. in any case, do_adaptation is a bool and so do_adaptation = True == engaged=1, and False == 0.
I’m adding args for init_buffer, window, and term_buffer which correspond to the initial fast adaptation interval, slow adaptation interval, and final fast adaptation interval.
I agree that do_adaptation is more informative than engage. In RStan this is called adapt_engaged. It seems it would be useful to have a bit more coordination between interfaces and bit more discussion in the future about the option (and function names). Related to that, there is this recent nice blog post https://djnavarro.net/post/be-some-other-name/
agreed that adapt_engaged is best, and since it’s early days, I will change it!
we are serving two masters: the devil we know and the devil we don’t. the devil we know are folks using Stan to do HMC sampling via one of the existing interfaces. for them, we should keep the names as similar as possible. the devil we don’t know are all the folks using Python but not using Stan, PyStan, or PyMC3, but who should be. for them, we want argument names that make as much sense as possible. if the existing names make as much sense as possible, no problem - do they?
to make this concrete, a spreadsheet of what each interface calls all the controls on the HMC sampler would help. I’ll try and get one together - or maybe one exists, in which case, please point me to it.
I agree, and we should be ready to change the argument names also if we realize that they are not making sense. Especially, the variational inference is labeled as experimental, so no-one should rely it to be stable even in option names…
Yes, that would be helpful and even if would not change something now, it would help when adding new interfaces, new options, or when planning Stan 3.
Yes, but they’re going to be reorganized and flattened.
It will be.
Please don’t use an engaged argument. If the number of warmup iterations is zero, there will be no adaptation. I don’t think we need a no-adaptation option that also throws away some set of initial burn-in draws. The only time we won’t want to adapt is when we’re restarting.
I think the skeleton commands in the CmdStan doc are what you want. That’s the set of options you have available from CmdStan.
The issue here is that for large models some of the chains may not optimize the sampler options equally well, so the option I found was to run long burn-in chains and manually select a step size and mass matrix from those. The following run would use them without any further optimization optimization.
I understand that for most cases adaptation is required, but the option should be accessible in case something like this is needed (or the possibly rare case where the model is parameterized for a unit mass matrix).
What I was suggesting is that we don’t need an adapt-engaged argument. If the number of warmup iterations is set to zero, there won’t be any adaptation.
You didn’t say so explicitly, but it sounds like you want to run without adaptation of warmup or step size, but still throw away some initial sequence of draws. That is, you want a BUGS or JAGS “burn in” option. My inclination would be to not support that in our interface and let users prune draws after the fact manually. The downside is that it’d require more memory for you to save everyting then throw the first half out.
One approach may be to write separate functions. That’s how everything is broken out at the C++ level. There’s a Euclidean diagonal metric version of NUTS with adaptation and a completely separate function without, and they both allow you to throw away some initial set of draws. We may go that way if we really do need to support this edge case. I’d really like to avoid the kind of tangled set of argument dependencies we’d get with an adapt-engaged argument.
Right, that makes sense, but I think warmup=0 wouldn’t work (at least in previous versions of I think PyStan), not sure why but it wouldn’t actually sample (maybe the step/mass matrix were too far off, but I think it was something else that broke). That (plus not being able to supply a step size and mass matrix) is the reason I started using CmdStan, in fact.
Not really. I just would like to be able to turn off adaptation and provide a fixed step size and mass matrix. I don’t think you have to support the entire cross-product, but it looked like some things were only possible with CmdStan.
I think it’s important to point out the importance of adaptation (like you did) and not make it “too easy” to run Stan without it (as opposed to MH, for there is no conceptual difference between burn-in and sampling phase if scale is tuned throughout). Most times CmdStan will do the job, but R/Python data frames are a lot more convenient for data processing than shell scripts, so I guess most people, myself included, will be happy to restrict bash to a minimum.