@bgoodri is proposing to change the way Stan services report iterations from an iteration-based update to a time-based update. The proposal is to change the behavior of refresh = N
to now mean every N
seconds. The proposal is a default of 30s. So maybe an initial report (?) and then the next update at 30s by default. @andrewgelman has been requesting behavior like this.
There’s been ongoing discussion on the issue tracker for Ben’s PR, but I thought I’d bring it up here, which is where we should be having this discuss. So I’m going to move over my last summary from the issue; the quoted bits are from Ben.
The backwards compatibility standard you are applying is inconsistent with the standard of backward compatibility that has been applied in the past.
It’s OK for us to change standards. What we need to do is make them more clear at any given time.
I do not think the right thing to do is try to enforce bit-level release-over-release backward compatibility.
In addition to the chain ID hokey pokey, we have changed the number of digits in the print output, changed the way n_eff is calculated, and are changing the way R_hat is calculated. In each of those situations, we didn’t introduce another flag that defaults to false in order to keep doing things the old way unless the user flips it on; we forced people to do things the new way because it is better.
It might be helpful to separate out the concerns here.
I think the ID “hokey pokey” was a bug that got through due to insufficient testing.
The n_eff thing changed the estimator, not what’s being estimated. So nothing in the API changes meaning.
The R-hat replacement’s a work in progress. If it ever does come up, I will be OK deprecating R-hat support and adding a second diagnostic that we recommend. I’m also OK changing the default output in various ways.
My concerns about this particular pull request are:
-
Backward compatibility
refresh=N changes meaning from N iterations to N seconds -
Non-deterministic behavior
With the same seed, get different number of iteration updates.
I’m OK with (2) but wanted to check with @wds15, since he’s been the one that’s been most constrained by “reproduciblity” at this level. My guess is that you (Ben) are right and moving to a non-deterministic behavior is OK.
I’m less OK with (1), which is why I think the simple expedient of allowing control by both iterations and time makes sense. Like with potential R-hat replacements, I think the responsible thing to do would be to deprecate the per-iteration method and add the timed method and then remove per-iteration in a major release.
The progress reports is even less consequential than those things. The only harm is if someone has an old script where they specify refresh = 1000 or something in order to keep the progress output minimal and then they run their old script again, which gets reinterpreted as “refresh every 1000 seconds”, then the progress output will be even more minimal than what they might have been expecting based on Stan’s past behavior. But that isn’t a big deal.
I agree the danger is minimal here. The “harm” in my mind is the perceived (and actual) instability of our interfaces.
I’d like to hear from more devs here than just the two of us on this issue before we decide how to proceed within stan-dev/stan
.
The way we’ve set up project governance, @bgoodri will have the final say on the behavior of RStan. But what we’re talking about here is the stan-dev/stan
services, which I believe @syclik is managing. So I’d very much like to get @syclik’s sign off on a change this big before merging.