Interface roadmap - last draft before ratification vote

I prefer fit[i] and I think most R users would find it more natural as well, but adding an iteration class method is pretty cheap. In this case, though, I think the modal number of users of it is 1, so I would prefer overloading the existing S3 method for [.
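For concreteness, the overload being proposed could look something like this minimal sketch. The class name `fit`, the constructor `new_fit`, and the storage layout (a named list of arrays with the iteration as the first dimension) are all assumptions for illustration, not the actual implementation:

```r
# Toy fit object: draws stored as a named list of arrays,
# with iterations along the first dimension (assumed layout).
new_fit <- function(draws) structure(list(draws = draws), class = "fit")

# S3 overload: fit[i] returns the i-th draw of every parameter,
# each with the same shape it has in the Stan program.
`[.fit` <- function(x, i) {
  lapply(x$draws, function(par) {
    d <- dim(par)
    if (is.null(d)) {
      par[i]
    } else {
      # TRUE as an index selects everything along each remaining dimension
      do.call(`[`, c(list(par, i), rep(list(TRUE), length(d) - 1)))
    }
  })
}

fit <- new_fit(list(alpha = 1:4,
                    theta = array(1:24, dim = c(4, 2, 3))))
draw2 <- fit[2]  # list: scalar draw2$alpha, 2x3 matrix draw2$theta
```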

For completeness, we’ll probably distinguish between them in the class hierarchy, but they are stored the same, so I don’t think the distinction makes any difference to users.

@seantalts, would it be possible to take a step back and put together a big-picture roadmap? I understand that this is the result of people who were able to attend an in-person meeting, but the draft that was put up really is focused on the finer details.

Reading the top post, it’s actually really difficult to work out what Stan will focus on over the next year in order to get better. Perhaps this could be fixed by distilling the information down to the bigger picture, rather than hashing out implementation details and asking the electorate to validate those.

(For more specific info… there are a lot of things listed, which is good, but they’re at different levels of technical information. There are multiple focuses in there, which makes the roadmap a bit confusing… are the hessian and hessian_grad_product supposed to be added to PyStan and RStan directly, but not part of CmdStan, CmdStanPy, and CmdStanR? Something like “Unifying names” seems like a big picture item, but something that’s on a different level is “Template parameter defaults.” Is there some natural big-picture item that “Template parameter defaults” can nest under? I just checked… they’re at the same header level.)

What would be most helpful to you? Could I provide some proposed updates to the draft?

Again, this is just the interfaces piece of a larger roadmap. I’ve added a few more uncontroversial things to flesh it out a bit, but it still ultimately is a small piece of a larger document.

The past two months have been all about providing proposed updates to the draft :) Please comment away! However, it sounds like everyone else is pretty close to agreement here and we could likely move to ratify this week, so this is probably a bad time to try to change significant portions of it. Ideally we would have had some input from you on content much earlier in the process.

But don’t despair - the remaining topics will have another meeting for another roadmap doc and we’ll hopefully get Math represented in the room there :)

Is the goal to ratify the roadmap piece by piece? I’m curious… what happens if this fails to be ratified? And… is this strict majority vote out of the existing electorate?

Sorry – that last post was just confusion as to what’s happening.

So far, it sounds like:

  1. there are going to be pieces of the roadmap
  2. the overall roadmap will be a collection of these pieces; there won’t be an effort to make this coherent.
  3. the roadmap is going to be ratified piece-by-piece by the electorate (it says “full vote by the electorate” so I assume this means majority needs to vote yes to accept it?)
  4. the roadmap is designed to live for one calendar year

@seantalts, is that correct? How many other docs do you envision?

If we think most users won’t even use this then I think a named method that they can just ignore or not know about is safer than overloading the indexing method for the fitted model object.

Yeah, but if we do too many things like that then there will be too many options when you tab-complete after the $ sign.

I’m hoping we can do two pieces. The number two here is purely logistical - I’m hoping we can get all the tech leads together in at most two meetings with substantial attendance overlap. Plus, we were pretty tired of talking about the roadmap by the end of the second day last time, so it’s nice to have it split up to make the process more manageable. I think we can make the final doc, or the concatenation of the two docs, fairly coherent, though it’s always going to be a list of separate projects.

Yes, majority vote. It can live until it’s replaced, and the plan is to replace it every year (that’s in the SGB’s definition of the TWG Director).

I’ll add a TL;DR to the top of the document, I think you’re right that a high-level summary would be really useful.

Okay, so, I think just this one last thing to resolve - fit[i] vs fit$attribute.

I will just say that if you’re a new user and you want to know how to get a single iteration, tab completion on a fit object with an overloaded [ isn’t going to help you. I’ve been programming in OCaml, where the docs often consist of editor-supported tab completion, and I have to say I have no problem with a long list of methods with informative names (vs. symbols like >>= and >>| in OCaml). How do you all want to decide these things? Sounds like @andrewgelman, @jonah, and @ahartikainen might prefer the method, while @bgoodri doesn’t care too much but prefers the []. @ariddell, do you have a preference on overloading fit[i] vs adding a separate method for fit$iteration(i)?
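The named-method alternative could be sketched like this, with an iteration() closure stored on the fit object so it shows up under $-tab-completion. Again, the names and the storage layout here are hypothetical:

```r
# Toy fit object exposing the named-method API: fit$iteration(i)
# returns the i-th draw of every parameter (assumed layout:
# iterations along the first array dimension).
new_fit <- function(draws) {
  fit <- list(draws = draws)
  fit$iteration <- function(i) {
    lapply(draws, function(par) {
      d <- dim(par)
      if (is.null(d)) {
        par[i]
      } else {
        do.call(`[`, c(list(par, i), rep(list(TRUE), length(d) - 1)))
      }
    })
  }
  fit
}

fit <- new_fit(list(alpha = 1:4, beta = matrix(1:8, nrow = 4)))
fit$iteration(3)$alpha  # the third posterior draw of alpha
```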

Hi all. As the “user, not developer” in this discussion, it’s hard for me to say for sure what I will like before I use it. What I can say is that right now, I’m very often doing the following steps:

  1. Fitting a Stan model.

  2. Extracting a posterior summary (most typically, posterior medians, but I could want means or quantiles or single draws).

  3. Using the summary to do later calculations as with MRP or graphing fitted models. When it’s single draws, I’ll loop thru the draws.

In step 3, I want to be able to have objects that are the same size and shape as the parameters in the Stan model. So, in my running example, scalar alpha, vector beta, 2x3x4 array theta.

Right now, my code is a mess because, after running Stan, I first have to extract the objects, then I have to do awkward steps to compute posterior medians or extract draws. For example:

```r
fit <- stan(…)
alpha_hat <- median(extract(fit)$alpha)
beta_hat <- apply(extract(fit)$beta, 2, median) # or something like that; I can never remember how to do “apply”
theta_hat <- something something # I don’t actually know how to do this without looping thru all the dimensions of theta
```

And then I can use alpha_hat, beta_hat, theta_hat in calculations.
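For what it’s worth, the theta_hat step can be done without looping over dimensions by applying over every margin except the first (the iterations). A sketch, with post_median as a hypothetical helper name rather than anything that exists today:

```r
# Summarize posterior draws (iterations along the first dimension)
# into an object with the same shape as the parameter in the Stan
# program. post_median is a made-up name for illustration.
post_median <- function(par) {
  d <- dim(par)
  if (is.null(d)) return(median(par))       # scalar parameter
  # apply over every dimension except the first (iterations)
  apply(par, seq_along(d)[-1], median)
}

# 1000 draws of a 2x3x4 array parameter
theta_draws <- array(rnorm(1000 * 2 * 3 * 4), dim = c(1000, 2, 3, 4))
theta_hat <- post_median(theta_draws)  # a 2x3x4 array, no loops needed
```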

So if we don’t have these extractor functions which allow me to pull out the posterior median, or mean, or random draws, or other things, then I need tons of helper code which makes this material difficult to teach, difficult to explain, and contrary to the spirit of probabilistic programming and Bayesian workflow.

I guess this is my personal analogy to people wanting things to be “Pythonic.” I want things to be “Bayesian workflow”-thonic. And I think this will be increasingly important.


It’s a pretty minor distinction between fit[i] and fit$iteration(i). Does anyone have strong preferences?

How hard would it be to create a prototype for the interface (wrapping RStan2 + PyStan2/PyStan3) and see what the use cases are?

We have some prototype code and I think @ariddell does as well. But in this case, we have 1 person in the last eight years who wants this functionality.


Sure, and still I think the current extract is hard, and so probably is fit['theta'] with non-user-defined functions (functions that are designed for one draw).

I think the one thing to consider is: do we want to run fit results against external functions, or run functions against the fit object?

fit --> parameters for function
function --> fit

Or do we assume users should use broadcasting / apply.

I don’t think there are many functions that expect a single draw as input and can’t take all the draws at once. But mean(fit$theta) should definitely work and average over all iterations to produce something with the same dimensions as in the Stan program. I guess mean(fit) could apply the mean function to every parameter in fit and return a list of mean(fit$alpha), mean(fit$beta), etc.
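A mean() method along those lines could be sketched as follows; the class name "fit" and the storage layout (draws as a named list of arrays with iterations first) are assumptions for illustration:

```r
# mean() method: average over iterations for every parameter,
# returning objects shaped like the Stan declarations.
mean.fit <- function(x, ...) {
  lapply(x$draws, function(par) {
    d <- dim(par)
    if (is.null(d)) mean(par) else apply(par, seq_along(d)[-1], mean)
  })
}

fit <- structure(list(draws = list(
  alpha = rep(1, 100),
  theta = array(rep(2, 100 * 2 * 3), dim = c(100, 2, 3))
)), class = "fit")

m <- mean(fit)  # m$alpha is a scalar; m$theta is a 2x3 matrix
```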

If someone is counting, I would like to have that functionality, too. I don’t have an opinion on how it should be implemented.

Sure, I mean functions outside mcmc packages. Like physics simulation etc.

Do those functions basically work like the generated quantities block, taking in one realization of the parameters from the posterior distribution and simulating the path of a particle over time?

Basically yes. Some of them can probably use broadcasting. Just that there are also other uses for mcmc draws than just common stats.

I get it. It is certainly possible to use fit[i] or fit$iteration(i) to loop over all iterations, but that isn’t what Andrew is talking about. Is there a reason to prefer fit[i] vs. fit$iteration(i) in these physics examples?
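For concreteness, the generated-quantities-style loop either API would support looks like this; simulate_path is a made-up stand-in for an external physics simulator, and the draw layout is assumed:

```r
# Stand-in for an external simulator: takes one realization of the
# parameters and simulates a trajectory over time.
simulate_path <- function(pars, n_steps = 10) {
  cumsum(rnorm(n_steps, mean = pars$alpha))  # toy random-walk path
}

draws <- list(alpha = rnorm(50))  # pretend posterior draws

# Feed one draw at a time to the external function; with either
# proposed API, only the expression building `pars` would change
# (fit[i] vs fit$iteration(i)).
paths <- lapply(seq_along(draws$alpha), function(i) {
  simulate_path(list(alpha = draws$alpha[i]))
})
```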