Rather than hijacking the original topic, I’m opening a new one to lay out a radical rethinking of how we handle output that @seantalts and I have been discussing. We brought it up briefly during this week’s Stan meeting, but I thought I’d write up what we’ve been thinking about.
Comments obviously welcome—that’s why it’s up here!
Current services output
The interfaces talk to Stan through the Stan services methods, which are in the stan-dev/stan repo in directory src/stan/services. There are about 20 such methods including HMC variants (NUTS/static, adaptation/not, metric shape), ADVI variants (dense and diagonal covariance), and optimization variants (L-BFGS, BFGS, Newton). Each of these 20 or so methods takes a slightly different set of callbacks, though there are a lot of commonalities.
Callbacks of this form impose a C++ burden on new implementations. The goal of the discussion in the original topic was to consolidate these C++ callback writers into some kind of object so it would be easier to add new ones without breaking backward compatibility.
This is difficult to do with appropriate type checking, etc., and the proposal has been stalled since @martinmodrak produced an original design for consolidated output and @sakrejda produced the excellent overview of what kind of I/O we’re currently doing.
Logging and other message output
We want to move our general console output to more of a logger-based style, separating debug output, informational output, warning output, and error output. We haven’t made any progress doing this, either.
Combo proposal
We can fix everything at once in a way that is embarrassingly forward compatible.
- Remove the output callback arguments (nope, no replacement!)
- Define a global, static logger
- Replace calls to current callback functions with messages written to the global logger
Thread safety for multi-threaded log density evals will require a synchronized FIFO message queue to be implemented for the logger, but that only needs to be coded once, and there won’t be a huge amount of contention. And we have threading support in C++11 now.
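To make the queue idea concrete, here is a minimal sketch of the pattern in Python. The real logger would live in Stan's C++ code; Python is only a stand-in here. The names Logger, format_value, and the sink callable are hypothetical, and rendering booleans as true/false is an assumption, not something we've settled.

```python
import queue
import threading


def format_value(value):
    """Render one value as text (the encoding rules here are assumptions)."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, str):
        return '"' + value + '"'
    return repr(value)


class Logger:
    """Hypothetical logger: any thread enqueues, one thread writes."""

    def __init__(self, sink):
        self._sink = sink            # callable that receives each finished line
        self._queue = queue.Queue()  # thread-safe FIFO from the standard library
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def log(self, tag, *values):
        # Format on the calling thread so the writer thread only does output.
        body = ", ".join(format_value(v) for v in values)
        self._queue.put(f"{tag} {body}" if body else tag)

    def _drain(self):
        while True:
            line = self._queue.get()
            if line is None:         # sentinel pushed by close()
                break
            self._sink(line)

    def close(self):
        self._queue.put(None)
        self._writer.join()


# Two "chains" log concurrently; each chain's lines stay in order.
lines = []
logger = Logger(lines.append)


def run_chain(chain_id):
    for iteration in range(1, 4):
        logger.log("POSTERIOR-DRAW", chain_id, iteration, 0.5 * iteration)


chains = [threading.Thread(target=run_chain, args=(c,)) for c in (1, 2)]
for t in chains:
    t.start()
for t in chains:
    t.join()
logger.close()
```

Lines from different chains may interleave in any order, but each chain's own lines arrive in the order they were logged, because every producer pushes onto the same FIFO and a single thread drains it.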
The global logger will need to be able to handle a range of line-based message types, but each will be of the simple form:
<TAG> value1, ..., valueN
The values can be strings, booleans, integers, reals, vectors, and matrices.
For example, an informational log message could be written as
INFO "Adaptation terminated normally. Beginning sampling."
and a warning message could be:
WARN "Rejected proposed parameter value; normal_lpdf scale parameter was -1.2 and must be positive"
Our header, warmup draws, and actual draws can be:
PARAMETER-NAMES "mu", "sigma"
...
WARMUP-DRAW 1, 2, -3.976542874635, 1.545237485639
...
POSTERIOR-DRAW 1, 1285, -3.222184747564, 2.974526204756
The draws have a chain ID (and a redundant iteration ID, to make things easier on the client side).
The adapted metric from warmup could look something like this if it is adapted on iteration 40, say:
ADAPTED-METRIC 40, 0.59815757, 1.4333337656535
Whether it’s a matrix or a vector is clear from the number of values, because the parameter names are already known at this point.
This is just a sketch—don’t treat it as a serious proposal.
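On the client side, reading lines of this shape could be as small as the following Python sketch. The function names parse_line and convert are hypothetical. As assumptions, it tolerates an optional comma right after the tag, treats true/false as booleans, and expects quoted strings to contain no embedded double quotes.

```python
import re

# One field: either a double-quoted string or a bare token, then a comma or end.
_FIELD = re.compile(r'\s*(?:"([^"]*)"|([^,"]+?))\s*(?:,|$)')


def convert(token):
    """Turn a bare (unquoted) token into a bool, int, or float if possible."""
    if token in ("true", "false"):
        return token == "true"
    try:
        return int(token)
    except ValueError:
        pass
    try:
        return float(token)
    except ValueError:
        return token  # leave anything unrecognized as text


def parse_line(line):
    """Split '<TAG> value1, ..., valueN' into (tag, [values])."""
    tag, _, rest = line.strip().partition(" ")
    values = []
    for match in _FIELD.finditer(rest):
        quoted, bare = match.groups()
        values.append(quoted if quoted is not None else convert(bare))
    return tag.rstrip(","), values


tag, values = parse_line("ADAPTED-METRIC 40, 0.59815757, 1.4333337656535")
iteration, metric = values[0], values[1:]
```

Here the tag comes back as ADAPTED-METRIC, the iteration as the integer 40, and the metric as a list of two floats, which matches a diagonal metric for the two parameters mu and sigma.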
Cons
- slower to read a human-readable ASCII output line and convert it to a double-precision floating point value than to read out of the callbacks
- routing logic of messages to output sources remains up to the interfaces
- we need to rebuild a bunch of things in CmdStan, RStan, and PyStan to get back to where we are now
Pros
- everything is ordered within chain but asynchronous across chains, so we don’t need to worry about output periods (per iteration, multiple per iteration, intermittent, etc.). By construction, everything’s intermittent, and only convention makes it more regular.
- no longer a type puzzle to figure out where to add which kinds of output to minimize effort across service functions
- no longer a custom pipefitting C++ interface job for the interfaces to deal with new kinds of output
- forward compatibility is easy because interfaces can ignore output they don’t care about
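The forward-compatibility point can be illustrated with a small routing loop in Python. This is only a sketch of what an interface might do; the route function and the handler table are hypothetical names, and the warning text is abbreviated for the example.

```python
def route(lines, handlers):
    """Pass each line's payload to the handler for its tag; skip unknown tags.

    Returns the number of lines that had no handler.
    """
    skipped = 0
    for line in lines:
        tag, _, payload = line.strip().partition(" ")
        handler = handlers.get(tag.rstrip(","))  # tolerate a comma after the tag
        if handler is None:
            skipped += 1  # a tag this interface does not know about yet
            continue
        handler(payload)
    return skipped


draws = []
warnings = []
handlers = {
    "POSTERIOR-DRAW": lambda p: draws.append([float(x) for x in p.split(",")]),
    "WARN": warnings.append,
}

stream = [
    "POSTERIOR-DRAW 1, 1285, -3.222184747564, 2.974526204756",
    "TRAJECTORY 1, 172, 1, 0.37459864683, 1.002347563523",
    'WARN "Rejected proposed parameter value"',
]
skipped = route(stream, handlers)
```

An interface written before TRAJECTORY existed keeps working: the unknown line is counted and dropped, and the draw and the warning are handled as before.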
New output example
As an example of how simple new output would be, consider adding trajectory information—it’d have a tag, chain ID, iteration ID, and trajectory iteration ID (may be confusing with NUTS going backwards and forwards in time!); again, we know the parameter names before we get a trajectory.
TRAJECTORY 1, 172, 1, 0.37459864683, 1.002347563523
TRAJECTORY 1, 172, 2, 0.426747582, 1.106353516464
TRAJECTORY 1, 172, 3, 0.5576465755955, 1.0776464628495
and divergences get chain IDs, iteration IDs, and position and momentum vectors, for example:
DIVERGENCE 3, 1204, <position>, <momentum>
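For what it's worth, picking trajectories back out of the stream might look like this in Python. The function name collect_trajectories is hypothetical, and the field layout (chain ID, iteration ID, trajectory iteration ID, then the position) is just the one from the TRAJECTORY lines above. Steps are kept in arrival order rather than sorted, since the trajectory IDs may not reflect time order under NUTS.

```python
from collections import defaultdict


def collect_trajectories(lines):
    """Group TRAJECTORY lines by (chain, iteration), keeping arrival order."""
    trajectories = defaultdict(list)
    for line in lines:
        tag, _, payload = line.strip().partition(" ")
        if tag.rstrip(",") != "TRAJECTORY":
            continue  # every other message type is ignored here
        fields = [f.strip() for f in payload.split(",")]
        chain, iteration, step = (int(f) for f in fields[:3])
        position = [float(f) for f in fields[3:]]
        trajectories[(chain, iteration)].append((step, position))
    return dict(trajectories)


stream = [
    "TRAJECTORY 1, 172, 1, 0.37459864683, 1.002347563523",
    "POSTERIOR-DRAW 1, 172, 0.5, 1.0",
    "TRAJECTORY 1, 172, 2, 0.426747582, 1.106353516464",
    "TRAJECTORY 1, 172, 3, 0.5576465755955, 1.0776464628495",
]
result = collect_trajectories(stream)
```

The result has a single key, (1, 172), holding three (step, position) pairs.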
Alternative
Instead of a human-readable, line-based UTF-8 format, we could use protocol buffers. The most recent spec allows streaming output with the kinds of structure we need. It looks like RProtoBuf (it’s part of the Rcpp suite) is a pretty full interface and I’d guess Python would have excellent protobuf support given its use at Google.
Just a Sketch!
Remember, this is just a sketch. The real tags and message formats need to be designed.