Schema for callback writers: one step towards retiring CSV

I think that makes mcmc_writer analogous to what Daniel is proposing for all algorithms. The proposed version just has a little more structure.

Woo hoo! You’re right. (It wasn’t originally all separated, but it is now and handled by mcmc_writer as you mention.)

So no problem with moving forward?

I’ll put up a prototype.

1 Like

As @betancourt correctly pointed out, it’s actually separated out. So actually implementing this will be a lot simpler than I had originally anticipated!

(I had thought it was all tangled, but I think I just got lost with all the indirection. The model concept for the mcmc code requires two methods: log_prob() (with Eigen::Vector) and num_params_r()).

That’s definitely one way of addressing it.

If I try to reason about not running the generated quantities block in warmup, then I think it makes sense to separate the output.

I have no problems with, and see some of the utility of, reorganizing the API so that it sends info to separate writers for

  • unconstrained parameters
  • constrained parameters
  • transformed parameters
  • generated quantities
  • sampler parameters

Getting the middle three separately will require rewriting the model class to allow each of those to be generated from the unconstrained parameters separately, and note that the immediately method of generating them independently will incur duplicated calculations that can be significant in some models (which is why they were all computed at once originally, no?).

In terms of implementation, however, I am not a fan of the proposal. Each of the above can, and in my opinion should, be abstracted to columnar data (naturally writable to data frames, dictionaries, CSV, etc) without any knowledge of what the algorithm or model will be spitting out. This abstraction will allow the same code to be applied to any algorithm (at the point where it emits a state) and will not have to be changed when the algorithms change.

I also think the algorithm configuration, adaptation, and timing info should be tackled in another issue as it requires a different architecture altogether, no matter how we do it.

That’s perfect—this is how it should be from the algorithm perspective. I thought it was more tangled than just two methods. It does put more of a burden on mcmc_writer than I thought was there, but I haven’t been following closely. That seems like an OK place to locate functionality to me.

That seems right. We want to separate the per-iteration output from the other output.

No rewrite needed. The write_array method lets you specify which ones you want. If you get them all at once, it’s cheaper. Even so, there’s some redundancy in recomputing constrained parameters and transformed parameters, which would’ve already been done in the final iteration. The recalculation is double only so we didn’t think it’d be that big of a hit.

I think this is trickier. What I’d like to see is something like an event callback handler for MCMC or for HMC or for adaptive HMC. But that drives us back into this variant typing insanity which has been at the heart of our disagreements about how to factor these 20 or so similar-yet-different ways of calling HMC.

1 Like

HMC’s 20 or so ways of calling seems like it would be as complicated as any algorithm gets. I think it’s worth a type hierarchy of writers to handle that, with some policy classes to avoid the combinatorial problem. Couldn’t be more complicated than the type hierarchy for CmdStan arguments, it should be simple code and it’s not something that would change that often.

  1. Type hierarchy with C++ inheritance

    • pro: lets us catch errors at compile time
    • con: requires multiple inheritance of implementation
    • con: complicated to change
    • con: complex code
  2. Type hierarchy with dynamic C++ pointers

    • what CmdStan does now for arguments (though now we’re talking HMC output, not config)
    • con: errors caught at run time
    • con: requires dynamic casts (would rather not)
    • con: complex code
    • pro: flexible without rebuilding type infrastructure
  3. One big handler

    • pro: only have to write one implementation
    • con: least type safe approach (or very complex type checking implementation, say that you don’t call incompatible write methods)
    • pro: most flexible approach

I was proposing something more like (3)—one big hairy HMC writer, but one that’s much flatter than the CmdStan structure.

I see that I was wrong to draw the analogy the the CmdStan config. I was thinking option #1 but with policy classes instead of multiple inheritance.

OK, so that’d make it look like (3) with some metadata. The policies would be something like +/- adaptation, { unit, diag, dense }, {NUTS / static HMC}, …? Would the idea be to make them template parameters? And they’d cause exceptions to be thrown if methods were accessed that go against policy?

I’m not too worried about policy enforcement here (unlike in say, CmdStan arguments), as this is all back-end to back-end stuff that we control.

1 Like

I think, that’s right, and better explained. I thought that type safety could be enforced by making the specific methods take their argument type from the policy s.t. you could have the diagonal mass matrix policy make the write_mass_matrix method take a a vector for it’s input.

Yes, that’s what I think of when I think policy classes.

I’m not sure when we’d need to do this but yes.

What would the policies do if not enforce the right functions being called?

I’m used to thinking of policy classes as something that injects behavior, following
the definition here: c++ - What is the difference between a trait and a policy? - Stack Overflow

So a writer class might be (ignoring correct syntax here):

class FileWriter<MassMatrixPolicy> {

   write_mass_matrix(MassMatrixPolicy::input_type x) {
     Eigen::SparseMatrix mass_matrix = MassMatrixPolicy::format_mass_matrix(x);
     write_mass_matrix_to_file(mass_matrix)
  }

};

Some features:

  • the writer class contains nothing about how to deal with a unit mass matrix, a diagonal mass
    matrix, or a full mass matrix, that’s all in the policy
  • the writer does define how to write a mass matrix to (e.g.- a CSV, a binary, a database).
  • if you try to use a MassMatrixPolicy class for a diagonal mass matrix but it’s supposed to
    be used for a full mass matrix it will throw a compile-time error.

Neat—I hadn’t thought about something like that. It’d essentially turn the writer into a tuple of writers, the types of which are determined by the policies.

I think this is in Alexandrescu’s “Modern C++ Design: Generic Programming and Design Patterns Applied”. That’s what I also wanted to do with the CmdStan arguments until I realized it overlapped with Daniel’s design 90%. You can do some cool stuff if you mix it with variadic templates so you can have fairly arbitrary numbers of policies but that might be taking int over the edge. Anyway, ending my OT contribution.

Thanks, Bob! That nails the fundamental problem on the head.

For the current HMC calls, here is where the output can differ (Bob, I think you hit most of these earlier):

  • 3 x metrics: unit (no output because it’s assumed that it’s an identity matrix), diagonal (outputs a vector representing the diagonal), and dense (outputs the full matrix)
  • 2 x sampler parameters: NUTS (stepsize__, treedepth__, n_leapfrog__, divergent__, energy__), static (stepsize__, int_time__, energy__)
  • 2 x adaptation. Either written or not.

Things that are the same:

  • sample parameters: (lp__, accept_stat__)

The draws of unconstrained / constrained parameters are the same and they just take std::vector<double>. (They’re sized differently program + data to program + data, but on any particular program + data combination, it’s the same.)

@sakrejda, thanks for showing how it can be done with policies! I was originally thinking about classes and composition, but policies work too. Policies are difficult at first and require good doc, but it’s definitely worth taking a look. There is a balance in making it easy for interfaces to call.

(Btw, the CmdStan structure was @betanalpha’s design.)

It can be so much less code! Anyway, real reason to answer was that I’m happy to help code/doc this.

Oh I see, github blamed you b/c you moved that stuff to CmdStan.

1 Like