Analysis API


For example, I think ess could have the signature:

double ess(std::vector<const double*>& x, int size);

And I see why the writers would be useful, so maybe:

double ess(std::vector<const double*>& x, int size, writer& error, writer& warning, writer& log);

For this to really work, the interfaces need to store each variable within a chain in a contiguous block of memory. I think we could require that.


I’m not sure we need the writers for ess. They add a lot of complexity.

Could we use exceptions instead? Or return a std::pair with the first
value being the result and the second value being a string containing
the error message (or empty if no error). (This is a pattern Go uses.)


Yes, was already thinking that exceptions could handle exceptional events. I don’t think ess has any output other than the return (we don’t need any writers), but other diagnostic functions might?


Nothing that couldn’t be returned and then passed to a specialized writer method.


Yes, the double* for each chain would be contiguous, but they wouldn’t be contiguous with each other.

I’d much prefer exceptions if that’s workable from the interface perspectives.

If there’s an error, the value won’t make sense. Returning both the value and error message is then clunky as only one will ever get used. This error code return is also the tradition in C with functions with signatures of the form int foo(..., result_t* result) where the return is a return code indicating error or not and the function setting result with the actual value calculated.


Exceptions are well-supported in Python (Cython) and R (Rcpp). The Stan
library already uses them in a few places.


Unfortunately the conversation has diverged a bit here. Let me try to bring it back a bit.

  1. We do not throw exceptions to the interfaces for the algorithm API. This thread is not about redoing the API but adding additional routes for common diagnostic calculations. If there is interest in changing the client relationship and how error information is propagated then a new thread should be started.

  2. Diagnostic functions return various statuses depending on the thresholds we set (and possibly modifiable by the user). Okay, Warning, and Error messages do not indicate that the function has failed but rather that the input fit information is good or suspect. Exceptions would be more appropriate for failures to calculate the diagnostics.

  3. The question introduced in this thread is designing a uniform interface for all diagnostic functions that is sufficiently robust for current (and immediate future diagnostics) that is compatible with the algorithm API.



I’m not exactly sure what you mean by “algorithm API,” but it doesn’t quite matter because both the algorithm code in C++ and the services (C++ API for the interfaces) both throw exceptions.

For example, see src/stan/services/sample/hmc_nuts_dense_e_adapt.hpp line 85.

We could handle that a different way and catch all exceptions, but right now, we handle std::domain_error within the methods and allow any other exceptions to propagate.

All good points, but I was thinking about messages in conjunction with exceptions. Of course, I’m assuming that writers can either flush immediately or buffer and flush appropriately even when handling exceptions.

For warnings, I was thinking that there might be non-exceptional cases that don’t warrant failure of computing a value, but should indicate some warning. Of course, we could just say it either works or doesn’t and the only way to communicate output from one of these functions is through the message in the exception.

Just need some clarification… what are you calling the “algorithm API”?


That doesn’t sound like it should be an exception. But it also doesn’t seem like the return should be Error when it’s not an error.

For example, an exception should be thrown if the chains aren’t all the same length passed to R-hat (until we generalize, that is). We shouldn’t throw an exception if R-hat is above some nominal operating threshold like 1.1.


I think this would be o.k. in a writer or logger if we add typed messages. Otherwise an interface (e.g.- ShinyStan) will need to parse text messages in order to decide if it wants to show a different sequence of plots when models don’t converge. No reason typed messages can’t decay to text but the inverse doesn’t work.


These functions need to return indicators as to whether or not the diagnostic has passed or not so that users can programmatically check if the diagnostics pass instead of having to parse through the returned messages. Okay, Warning, and Error return codes refer to the diagnostic output, not anything to do with internal operation. Exceptions are fine for the execution of the diagnostic itself failing.


Sounds like we’re fine with the signature for simple functions like
ess. That’s progress.

As for the more complex diagnostic function, I see @betanalpha’s point.
This is tricky. Clearly we need to return different things depending on
what happens. And, equally clearly, there are multiple ways of doing this.

I’d like to add to the list of options returning a single string, which
is a JSON-encoded mapping of the relevant information. Since all
conceivable interfaces support parsing JSON and Boost supports writing
it, this could be a simple solution. It would, in my opinion, be simpler
than having multiple writers.


We already have a reasonable client/API relationship for passing messages in the algorithm API and I don’t see any reason to change that here. Let’s maintain a relatively uniform interface here, and if there’s interest in changing it then we can have a separate discussion to change everything uniformly.


Programmatic status codes sound good.


I’d like to try to round up the scope of what we have to support current functionality in a client-facing API. To summarize, I’m totally with Michael on reorganizing this into an interface client API.

Where I’m not sure there’s agreement is with

  • whether this is an I/O based interface (protobuf, etc.) or whether it’s a C++ API or whether it’s the former layered on top of the latter,

  • how much of the functionality should go in the math lib, and

  • who will manage it (related to where it goes, I suppose).

From the current client perspective, I think a bunch of functions do depend on chain structure:

  • R-hat
  • n_eff
  • mcmc std error

Then there are some functions that can take consolidated output across chains:

  • posterior quantiles
  • posterior means
  • posterior std. dev.

Things that I’m unclear about whether they need chain identities and what the output looks like are:

  • BFMI (???)
  • divergences (binary flag per draw)

That was just for HMC, of course.


+ treedepth exceedences

How about optimizing and ADVI? They return set of independent draws from the approximative parametric distribution. When combined with importance sampling there would additional importance weights. Relevant diagnostic functions

  • k-hat
  • n_eff
  • IS std error

and some summary functions as for mcmc

  • posterior quantiles
  • posterior means
  • posterior std. dev.


Thanks for the summary, @Bob_Carpenter. And thanks for adding to the list, @avehtari. Some of my responses.

For these methods, it should just be a C++ API. Earlier

in the thread, I believe we agreed that the input would be:

std::vector<double*> draws, std::vector<size_t> sizes

(the vector of sizes for generality; I think we could also have a convenience method that accepts std::vector<double*>, size_t.)

I think for the most part, the returns are straightforward, so we should use return values as the result and exceptions as exceptions to computing the function at hand.

For example, I think the R-hat function signature should look like (name can change, of course):

double split_r_hat(std::vector<double*> draws, std::vector<size_t> sizes);

There is an alternative return here, which could be something like std::pair<double, std::vector<double>> if we wanted to return the overall R-hat and then also one per chain, but I’m looking at the list and most of the things will either return a single value or one per chain, which is easy to return.

I believe we already have buy-in for these input types from CmdStan (me), RStan (@bgoodri), and PyStan (@ariddell).

I’d suggest these live in the stan-dev/stan repo. They’re not basic math functions. They’re functions that operate on the output of the algorithms, which are in that repo.

Once place they could go is under src/stan/services/analysis/ or something else under src/stan/services/.

I think functions like autocorrelation() might make sense to live inside the math library.

Hopefully by simplifying the function signatures, it won’t need to be managed much. A lot of it hasn’t needed to change over the years.

I think the arguments can be the same. For convenience, perhaps we offer an additional function signature (the second one):

std:vector<double> quantiles(std::vector<double*> draws, std::vector<size_t> size, std::vector<double> probs);
std:vector<double> quantiles(double* draws, size_t size, std::vector<double> probs);

Thoughts? I think the steps are to build one, integrate it all the way through, then build out the rest.


I believe that everyone agreed on a basic std::vector based API which will make it straightforward to build new routes.

The first routes will be for MCMC/HMC and mirror the functionality in Right now we’re still iterating on the k_hat diagnostic, hence the lack of movement on the API itself.

The code should stay in the stan repo, unless there are modular bits that would be useful to be expose to the Stan language and I propose that this functionality fall under the algorithms management.

Once we have the basic MCMC routes up then we can consider routes for other algorithms as @avehtari suggested. If there is interest in establishing an example sooner rather than later then I can work with volunteers to get some of the more established routes up (neff, rhat, divergences, etc).


There are two k_hat diagnostics:

  • one for Monte Carlo error for each parameter (MC SE doesn’t make sense if true variance might be infinite)
  • one for joint posterior distributional approximations like ADVI and Laplace (optimizing)


Thanks for the summaries, @betanalpha and @syclik. I think this means we’re in agreement on the big picture and just need to flesh out details.

I agree that starting with HMC makes sense in terms of exact design, but it might still help to take an inventory of what’s going to be needed for all the algorithms so we make sure the basic infrastructure has the right coverage.

As for things we’re still working out the kinks of, I’d rather not put those in Stan. I want the standard for going into Stan to be something that we’ll support going forward for at least five or six years. This stops us from pulling the rug out from under users, who very much resent changes in releases.

Now, if there’s something we’re trying to estimate, there’s no reason we can’t put in our best shot at an estimator now and then try to make it better. That’s essentially the path we’re taking with n_eff, and that’s a good thing. So the key thing is to settle on the interfaces, not the precise implementations or estimators.