I think that makes mcmc_writer analogous to what Daniel is proposing for all algorithms. The proposed version just has a little more structure.
Woo hoo! Youâre right. (It wasnât originally all separated, but it is now and handled by mcmc_writer as you mention.)
So no problem with moving forward?
Iâll put up a prototype.
As @betancourt correctly pointed out, itâs actually separated out. So actually implementing this will be a lot simpler than I had originally anticipated!
(I had thought it was all tangled, but I think I just got lost with all the indirection. The model concept for the mcmc code requires two methods: log_prob()
(with Eigen::Vector
) and num_params_r()
).
Thatâs definitely one way of addressing it.
If I try to reason about not running the generated quantities block in warmup, then I think it makes sense to separate the output.
I have no problems with, and see some of the utility of, reorganizing the API so that it sends info to separate writers for
- unconstrained parameters
- constrained parameters
- transformed parameters
- generated quantities
- sampler parameters
Getting the middle three separately will require rewriting the model class to allow each of those to be generated from the unconstrained parameters separately, and note that the immediately method of generating them independently will incur duplicated calculations that can be significant in some models (which is why they were all computed at once originally, no?).
In terms of implementation, however, I am not a fan of the proposal. Each of the above can, and in my opinion should, be abstracted to columnar data (naturally writable to data frames, dictionaries, CSV, etc) without any knowledge of what the algorithm or model will be spitting out. This abstraction will allow the same code to be applied to any algorithm (at the point where it emits a state) and will not have to be changed when the algorithms change.
I also think the algorithm configuration, adaptation, and timing info should be tackled in another issue as it requires a different architecture altogether, no matter how we do it.
Thatâs perfectâthis is how it should be from the algorithm perspective. I thought it was more tangled than just two methods. It does put more of a burden on mcmc_writer
than I thought was there, but I havenât been following closely. That seems like an OK place to locate functionality to me.
That seems right. We want to separate the per-iteration output from the other output.
No rewrite needed. The write_array
method lets you specify which ones you want. If you get them all at once, itâs cheaper. Even so, thereâs some redundancy in recomputing constrained parameters and transformed parameters, which wouldâve already been done in the final iteration. The recalculation is double
only so we didnât think itâd be that big of a hit.
I think this is trickier. What Iâd like to see is something like an event callback handler for MCMC or for HMC or for adaptive HMC. But that drives us back into this variant typing insanity which has been at the heart of our disagreements about how to factor these 20 or so similar-yet-different ways of calling HMC.
HMCâs 20 or so ways of calling seems like it would be as complicated as any algorithm gets. I think itâs worth a type hierarchy of writers to handle that, with some policy classes to avoid the combinatorial problem. Couldnât be more complicated than the type hierarchy for CmdStan arguments, it should be simple code and itâs not something that would change that often.
-
Type hierarchy with C++ inheritance
- pro: lets us catch errors at compile time
- con: requires multiple inheritance of implementation
- con: complicated to change
- con: complex code
-
Type hierarchy with dynamic C++ pointers
- what CmdStan does now for arguments (though now weâre talking HMC output, not config)
- con: errors caught at run time
- con: requires dynamic casts (would rather not)
- con: complex code
- pro: flexible without rebuilding type infrastructure
-
One big handler
- pro: only have to write one implementation
- con: least type safe approach (or very complex type checking implementation, say that you donât call incompatible write methods)
- pro: most flexible approach
I was proposing something more like (3)âone big hairy HMC writer, but one thatâs much flatter than the CmdStan structure.
I see that I was wrong to draw the analogy the the CmdStan config. I was thinking option #1 but with policy classes instead of multiple inheritance.
OK, so thatâd make it look like (3) with some metadata. The policies would be something like +/- adaptation, { unit, diag, dense }, {NUTS / static HMC}, âŚ? Would the idea be to make them template parameters? And theyâd cause exceptions to be thrown if methods were accessed that go against policy?
Iâm not too worried about policy enforcement here (unlike in say, CmdStan arguments), as this is all back-end to back-end stuff that we control.
I think, thatâs right, and better explained. I thought that type safety could be enforced by making the specific methods take their argument type from the policy s.t. you could have the diagonal mass matrix policy make the write_mass_matrix
method take a a vector for itâs input.
Yes, thatâs what I think of when I think policy classes.
Iâm not sure when weâd need to do this but yes.
What would the policies do if not enforce the right functions being called?
Iâm used to thinking of policy classes as something that injects behavior, following
the definition here: c++ - What is the difference between a trait and a policy? - Stack Overflow
So a writer class might be (ignoring correct syntax here):
class FileWriter<MassMatrixPolicy> {
write_mass_matrix(MassMatrixPolicy::input_type x) {
Eigen::SparseMatrix mass_matrix = MassMatrixPolicy::format_mass_matrix(x);
write_mass_matrix_to_file(mass_matrix)
}
};
Some features:
- the writer class contains nothing about how to deal with a unit mass matrix, a diagonal mass
matrix, or a full mass matrix, thatâs all in the policy - the writer does define how to write a mass matrix to (e.g.- a CSV, a binary, a database).
- if you try to use a MassMatrixPolicy class for a diagonal mass matrix but itâs supposed to
be used for a full mass matrix it will throw a compile-time error.
NeatâI hadnât thought about something like that. Itâd essentially turn the writer into a tuple of writers, the types of which are determined by the policies.
I think this is in Alexandrescuâs âModern C++ Design: Generic Programming and Design Patterns Appliedâ. Thatâs what I also wanted to do with the CmdStan arguments until I realized it overlapped with Danielâs design 90%. You can do some cool stuff if you mix it with variadic templates so you can have fairly arbitrary numbers of policies but that might be taking int over the edge. Anyway, ending my OT contribution.
Thanks, Bob! That nails the fundamental problem on the head.
For the current HMC calls, here is where the output can differ (Bob, I think you hit most of these earlier):
- 3 x metrics: unit (no output because itâs assumed that itâs an identity matrix), diagonal (outputs a vector representing the diagonal), and dense (outputs the full matrix)
- 2 x sampler parameters: NUTS (stepsize__, treedepth__, n_leapfrog__, divergent__, energy__), static (stepsize__, int_time__, energy__)
- 2 x adaptation. Either written or not.
Things that are the same:
- sample parameters: (lp__, accept_stat__)
The draws of unconstrained / constrained parameters are the same and they just take std::vector<double>
. (Theyâre sized differently program + data to program + data, but on any particular program + data combination, itâs the same.)
@sakrejda, thanks for showing how it can be done with policies! I was originally thinking about classes and composition, but policies work too. Policies are difficult at first and require good doc, but itâs definitely worth taking a look. There is a balance in making it easy for interfaces to call.
(Btw, the CmdStan structure was @betanalphaâs design.)
It can be so much less code! Anyway, real reason to answer was that Iâm happy to help code/doc this.
Oh I see, github blamed
you b/c you moved that stuff to CmdStan.