Schema for callback writers: one step towards retiring CSV

sakrejda · August 12, 2017, 9:08pm

I think that makes mcmc_writer analogous to what Daniel is proposing for all algorithms. The proposed version just has a little more structure.

syclik · August 13, 2017, 1:35am

Woo hoo! You’re right. (It wasn’t originally all separated, but it is now and handled by mcmc_writer as you mention.)

So no problem with moving forward?

I’ll put up a prototype.

syclik · August 13, 2017, 1:51am

As @betancourt correctly pointed out, it’s actually separated out. So actually implementing this will be a lot simpler than I had originally anticipated!

(I had thought it was all tangled, but I think I just got lost with all the indirection. The model concept for the mcmc code requires two methods: log_prob() (with Eigen::Vector) and num_params_r()).
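
The two-method model concept can be sketched as follows. This is only an illustration: `ToyNormalModel` and `evaluate_at_origin` are hypothetical names, and `std::vector<double>` stands in for the Eigen vector so the sketch compiles without Eigen. The real `log_prob` in Stan has more to its signature than is shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: independent standard normals on each parameter.
// It exposes only the two methods the algorithm side is said to need.
class ToyNormalModel {
 public:
  explicit ToyNormalModel(std::size_t n) : n_(n) {}

  // Number of unconstrained real parameters.
  std::size_t num_params_r() const { return n_; }

  // Log density up to an additive constant.
  double log_prob(const std::vector<double>& theta) const {
    assert(theta.size() == n_);
    double lp = 0.0;
    for (double t : theta) lp -= 0.5 * t * t;
    return lp;
  }

 private:
  std::size_t n_;
};

// Algorithm-side code: works with any type providing the two methods.
template <typename Model>
double evaluate_at_origin(const Model& model) {
  std::vector<double> theta(model.num_params_r(), 0.0);
  return model.log_prob(theta);
}
```

Nothing in `evaluate_at_origin` knows about output, which is the point: everything about writing draws can live on the writer side.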

syclik · August 13, 2017, 1:55am

That’s definitely one way of addressing it.

If I try to reason about not running the generated quantities block in warmup, then I think it makes sense to separate the output.

betanalpha · August 13, 2017, 10:52pm

I have no problems with, and see some of the utility of, reorganizing the API so that it sends info to separate writers for:

- unconstrained parameters
- constrained parameters
- transformed parameters
- generated quantities
- sampler parameters

Getting the middle three separately will require rewriting the model class to allow each of those to be generated from the unconstrained parameters separately. Note that the immediate method of generating them independently will incur duplicated calculations that can be significant in some models (which is why they were all computed at once originally, no?).

In terms of implementation, however, I am not a fan of the proposal. Each of the above can, and in my opinion should, be abstracted to columnar data (naturally writable to data frames, dictionaries, CSV, etc) without any knowledge of what the algorithm or model will be spitting out. This abstraction will allow the same code to be applied to any algorithm (at the point where it emits a state) and will not have to be changed when the algorithms change.
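
A minimal sketch of the columnar abstraction described above, under stated assumptions: `ColumnWriter` and `CsvColumnWriter` are hypothetical names, not existing Stan classes, and the in-memory CSV backend is just one possible target.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical columnar writer interface: it sees only column names and rows
// of values, never the algorithm or model that produced them.
class ColumnWriter {
 public:
  virtual ~ColumnWriter() = default;
  virtual void write_names(const std::vector<std::string>& names) = 0;
  virtual void write_row(const std::vector<double>& values) = 0;
};

// One possible backend: comma-separated text held in memory.
class CsvColumnWriter : public ColumnWriter {
 public:
  void write_names(const std::vector<std::string>& names) override {
    n_cols_ = names.size();
    for (std::size_t i = 0; i < names.size(); ++i)
      out_ << (i ? "," : "") << names[i];
    out_ << '\n';
  }

  void write_row(const std::vector<double>& values) override {
    assert(values.size() == n_cols_);  // each row must match the header
    for (std::size_t i = 0; i < values.size(); ++i)
      out_ << (i ? "," : "") << values[i];
    out_ << '\n';
  }

  std::string str() const { return out_.str(); }

 private:
  std::ostringstream out_;
  std::size_t n_cols_ = 0;
};
```

An algorithm would hold one `ColumnWriter` per output stream (unconstrained parameters, sampler parameters, and so on) and call `write_row` wherever it emits a state. A data frame or dictionary backend would implement the same two methods.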

I also think the algorithm configuration, adaptation, and timing info should be tackled in another issue as it requires a different architecture altogether, no matter how we do it.

Bob_Carpenter · August 14, 2017, 12:22pm

That’s perfect—this is how it should be from the algorithm perspective. I thought it was more tangled than just two methods. It does put more of a burden on mcmc_writer than I thought was there, but I haven’t been following closely. That seems like an OK place to locate functionality to me.

Bob_Carpenter · August 14, 2017, 12:27pm

That seems right. We want to separate the per-iteration output from the other output.

No rewrite needed. The write_array method lets you specify which ones you want. If you get them all at once, it’s cheaper. Even so, there’s some redundancy in recomputing constrained parameters and transformed parameters, which would’ve already been done in the final iteration. The recalculation only doubles that work, so we didn’t think it’d be that big of a hit.
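
To illustrate the trade-off, here is a toy sketch of a `write_array`-style method with flags for which blocks to include. `ToyModel`, its three quantities, and the exact parameter list are assumptions made for this example; it is not Stan's generated model code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical toy model with one unconstrained parameter theta.
//   constrained parameter:  sigma       = exp(theta)
//   transformed parameter:  sigma_sq    = sigma^2
//   generated quantity:     twice_sigma = 2 * sigma
struct ToyModel {
  void write_array(const std::vector<double>& unconstrained,
                   std::vector<double>& out,
                   bool include_tparams = true,
                   bool include_gqs = true) const {
    assert(unconstrained.size() == 1);
    out.clear();
    // The constraining transform is computed once and shared by every block.
    const double sigma = std::exp(unconstrained[0]);
    out.push_back(sigma);
    if (include_tparams) out.push_back(sigma * sigma);
    if (include_gqs) out.push_back(2.0 * sigma);
  }
};
```

Asking for everything in one call evaluates `exp` once. Making separate calls to feed separate writers repeats it, which is the duplicated calculation raised earlier in the thread.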

I think this is trickier. What I’d like to see is something like an event callback handler for MCMC or for HMC or for adaptive HMC. But that drives us back into this variant typing insanity which has been at the heart of our disagreements about how to factor these 20 or so similar-yet-different ways of calling HMC.

sakrejda · August 14, 2017, 12:52pm

HMC’s 20 or so ways of calling seems like it would be as complicated as any algorithm gets. I think it’s worth a type hierarchy of writers to handle that, with some policy classes to avoid the combinatorial problem. It couldn’t be more complicated than the type hierarchy for CmdStan arguments; it should be simple code, and it’s not something that would change often.

Bob_Carpenter · August 14, 2017, 1:04pm

1. Type hierarchy with C++ inheritance
   - pro: lets us catch errors at compile time
   - con: requires multiple inheritance of implementation
   - con: complicated to change
   - con: complex code
2. Type hierarchy with dynamic C++ pointers
   - what CmdStan does now for arguments (though now we’re talking HMC output, not config)
   - con: errors caught at run time
   - con: requires dynamic casts (would rather not)
   - con: complex code
   - pro: flexible without rebuilding type infrastructure
3. One big handler
   - pro: only have to write one implementation
   - con: least type-safe approach (or needs a very complex type-checking implementation, say, to ensure that you don’t call incompatible write methods)
   - pro: most flexible approach

I was proposing something more like (3)—one big hairy HMC writer, but one that’s much flatter than the CmdStan structure.

sakrejda · August 14, 2017, 1:09pm

I see that I was wrong to draw the analogy to the CmdStan config. I was thinking option #1 but with policy classes instead of multiple inheritance.

Bob_Carpenter · August 14, 2017, 1:21pm

OK, so that’d make it look like (3) with some metadata. The policies would be something like +/- adaptation, { unit, diag, dense }, {NUTS / static HMC}, …? Would the idea be to make them template parameters? And they’d cause exceptions to be thrown if methods were accessed that go against policy?

I’m not too worried about policy enforcement here (unlike in say, CmdStan arguments), as this is all back-end to back-end stuff that we control.

sakrejda · August 14, 2017, 1:46pm

I think that’s right, and better explained. I thought that type safety could be enforced by making the specific methods take their argument type from the policy, so that, for example, the diagonal mass matrix policy makes the write_mass_matrix method take a vector as its input.

sakrejda · August 14, 2017, 1:47pm

Yes, that’s what I think of when I think policy classes.

I’m not sure when we’d need to do this but yes.

Bob_Carpenter · August 14, 2017, 2:42pm

What would the policies do if not enforce the right functions being called?

sakrejda · August 14, 2017, 2:46pm

I’m used to thinking of policy classes as something that injects behavior, following the definition in the Stack Overflow question “What is the difference between a trait and a policy?”

sakrejda · August 14, 2017, 2:54pm

So a writer class might be (ignoring correct syntax here):

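#include <Eigen/Sparse>

template <typename MassMatrixPolicy>
class FileWriter {
 public:
  // The policy decides what type the mass matrix arrives as.
  void write_mass_matrix(const typename MassMatrixPolicy::input_type& x) {
    // The policy converts its input into a common representation.
    Eigen::SparseMatrix<double> mass_matrix = MassMatrixPolicy::format_mass_matrix(x);
    write_mass_matrix_to_file(mass_matrix);
  }

 private:
  // Backend-specific output; definition omitted in this sketch.
  void write_mass_matrix_to_file(const Eigen::SparseMatrix<double>& m);
};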

Some features:

- the writer class contains nothing about how to deal with a unit mass matrix, a diagonal mass matrix, or a full mass matrix; that’s all in the policy
- the writer does define how to write a mass matrix to its target (e.g., a CSV, a binary, a database)
- if you pass the writer the wrong kind of input for its policy (say, a full matrix when the MassMatrixPolicy is the diagonal one), you get a compile-time error
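
A dependency-free sketch of the same idea, to make the compile-time behaviour concrete. All names here (`DiagMassMatrixPolicy`, `DenseMassMatrixPolicy`, `StringWriter`, `Dense`) are hypothetical, and a nested `std::vector` stands in for the Eigen sparse matrix so the example builds without Eigen.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for a matrix type: dense, row-major.
using Dense = std::vector<std::vector<double>>;

// Each policy declares the input type it accepts and how to expand that
// input into the common Dense representation.
struct DiagMassMatrixPolicy {
  using input_type = std::vector<double>;  // just the diagonal
  static Dense format_mass_matrix(const input_type& d) {
    Dense m(d.size(), std::vector<double>(d.size(), 0.0));
    for (std::size_t i = 0; i < d.size(); ++i) m[i][i] = d[i];
    return m;
  }
};

struct DenseMassMatrixPolicy {
  using input_type = Dense;  // the full matrix
  static Dense format_mass_matrix(const input_type& m) { return m; }
};

// The writer knows only the output format; the policy knows the metric.
template <typename MassMatrixPolicy>
class StringWriter {
 public:
  void write_mass_matrix(const typename MassMatrixPolicy::input_type& x) {
    const Dense m = MassMatrixPolicy::format_mass_matrix(x);
    for (const auto& row : m) {
      assert(row.size() == m.size());  // expect a square matrix
      for (std::size_t j = 0; j < row.size(); ++j)
        out_ << (j ? "," : "") << row[j];
      out_ << '\n';
    }
  }

  std::string str() const { return out_.str(); }

 private:
  std::ostringstream out_;
};
```

With this layout, passing a `Dense` object to `StringWriter<DiagMassMatrixPolicy>::write_mass_matrix` fails to compile because there is no conversion from `Dense` to `std::vector<double>`. A unit-metric policy could be added the same way without touching the writer.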

Bob_Carpenter · August 14, 2017, 3:13pm

Neat—I hadn’t thought about something like that. It’d essentially turn the writer into a tuple of writers, the types of which are determined by the policies.

sakrejda · August 14, 2017, 3:34pm

I think this is in Alexandrescu’s “Modern C++ Design: Generic Programming and Design Patterns Applied”. That’s what I also wanted to do with the CmdStan arguments until I realized it overlapped with Daniel’s design 90%. You can do some cool stuff if you mix it with variadic templates, so you can have fairly arbitrary numbers of policies, but that might be taking it over the edge. Anyway, ending my OT contribution.

syclik · August 14, 2017, 4:41pm

Thanks, Bob! That nails the fundamental problem on the head.

For the current HMC calls, here is where the output can differ (Bob, I think you hit most of these earlier):

- 3 x metrics: unit (no output because it’s assumed that it’s an identity matrix), diagonal (outputs a vector representing the diagonal), and dense (outputs the full matrix)
- 2 x sampler parameters: NUTS (stepsize__, treedepth__, n_leapfrog__, divergent__, energy__), static (stepsize__, int_time__, energy__)
- 2 x adaptation: either written or not

Things that are the same:

sample parameters: (lp__, accept_stat__)

The draws of unconstrained / constrained parameters are the same and they just take std::vector<double>. (Their size varies from one program + data combination to another, but for any particular combination it is fixed.)
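
As a sketch of how the sampler-parameter difference could be isolated, the column names listed above can be supplied by a policy while the shared part stays in one place. `NutsSamplerPolicy`, `StaticHmcSamplerPolicy`, and `header` are hypothetical names, and the column order used here is an assumption of this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sampler-specific columns, one policy per sampler variant.
struct NutsSamplerPolicy {
  static std::vector<std::string> names() {
    return {"stepsize__", "treedepth__", "n_leapfrog__", "divergent__",
            "energy__"};
  }
};

struct StaticHmcSamplerPolicy {
  static std::vector<std::string> names() {
    return {"stepsize__", "int_time__", "energy__"};
  }
};

// Builds the per-iteration header: shared sample parameters first, then the
// sampler-specific ones from the policy, then the model's parameter names.
template <typename SamplerPolicy>
std::vector<std::string> header(const std::vector<std::string>& model_names) {
  std::vector<std::string> h = {"lp__", "accept_stat__"};
  const std::vector<std::string> s = SamplerPolicy::names();
  h.insert(h.end(), s.begin(), s.end());
  h.insert(h.end(), model_names.begin(), model_names.end());
  assert(h.size() == 2 + s.size() + model_names.size());
  return h;
}
```

For example, `header<StaticHmcSamplerPolicy>({"mu"})` yields six columns. The metric and adaptation choices could be handled by further policies in the same style.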

@sakrejda, thanks for showing how it can be done with policies! I was originally thinking about classes and composition, but policies work too. Policies are difficult at first and require good doc, but it’s definitely worth taking a look. There is a balance in making it easy for interfaces to call.

(Btw, the CmdStan structure was @betanalpha’s design.)

sakrejda · August 14, 2017, 5:10pm

It can be so much less code! Anyway, real reason to answer was that I’m happy to help code/doc this.

Oh I see, github blamed you b/c you moved that stuff to CmdStan.
