Would there be any way to have Stan exit upon getting a specific number of divergences? I would think it would be doable on a chain-by-chain basis. It seems silly to have to sit through N chains running if I get 20 divergences per chain and it doesn’t tell me until the end.
Possibly after a certain number of divergences after warmup. It almost always diverges the iteration after a window change during warmup and possibly many more times as it is trying to adapt. However, you often want to let it run to the end to see where the divergences tend to occur.
I get that, but sometimes when you’re prototyping things you just want to know if it’s working or not on a general basis. Like if you coded something
That’s an interesting idea. It’d be nice to have termination filters that’d let you specify this kind of thing. An even more ambitious plan that would use cross-chain information would terminate if the chains are moving internally but not mixing.
As is, we just stream out the answer and don’t accumulate summary stats like the number of divergences as we go. It would be possible, but it would take some work with the current infrastructure.
If you use CmdStan or save the rstan output to a .csv file as you go, you can monitor this stuff yourself. That’s not a good general answer, and I like the idea, but it is a workaround that’s very functional at the moment.
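For anyone wanting to try the CSV workaround, here is a rough sketch in Python of counting divergences in a partially written sample file. It assumes the standard Stan CSV layout (comment lines starting with `#`, then a header containing a `divergent__` column) and that warmup draws are not being saved, so every row is a post-warmup draw. The function names and the threshold of 20 are just illustrative, not anything Stan or rstan provides.

```python
import csv
import io


def count_divergences(csv_text):
    """Count draws flagged as divergent in the text of a Stan CSV file.

    Assumes the sampler diagnostics include a `divergent__` column.
    """
    # Stan CSV output mixes '#' comment lines with ordinary CSV rows,
    # so drop comments and blank lines before parsing.
    rows = [
        line for line in csv_text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    reader = csv.DictReader(io.StringIO("\n".join(rows)))
    count = 0
    for row in reader:
        try:
            if int(float(row["divergent__"])) == 1:
                count += 1
        except (TypeError, ValueError):
            # The last row may be only partially written while the
            # sampler is still running; skip anything unparsable.
            continue
    return count


def should_stop(csv_text, max_divergences=20):
    """Hypothetical stopping rule: too many divergences so far."""
    return count_divergences(csv_text) >= max_divergences
```

You would re-read the file every few seconds and kill the run yourself once `should_stop` returns true. Polling a file that is still being written is fragile, which is why this is only a workaround.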
What we need is an online accumulator where you can monitor things like non-split R-hat and means and variances of chains. Wouldn’t be that hard with the streaming output. But it’d be hard to maintain and support cross-platform.
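To make the non-split R-hat part concrete, here is a small sketch of computing it from per-chain summaries only (number of draws, mean, and variance per chain), which is all an online accumulator would need to keep. This uses the usual between/within variance formula and assumes all chains have the same number of draws; the function name is made up for illustration.

```python
import math
import statistics


def nonsplit_rhat(n, chain_means, chain_variances):
    """Non-split R-hat from per-chain summaries.

    n               -- draws per chain (assumed equal across chains)
    chain_means     -- one mean per chain
    chain_variances -- one sample variance per chain
    """
    # W: average within-chain variance.
    w = statistics.mean(chain_variances)
    # B: between-chain variance, scaled by the number of draws.
    b = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * w + b / n
    return math.sqrt(var_plus / w)
```

Since only the means and variances are needed, each chain can update its own summaries as draws stream in, and R-hat can be recomputed at any point without storing the draws.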
This has been on my to-do list with my R package ezStan, which already does some minimal processing of the sample files for my alternative progress watcher. I happen to have been working on an update this week fixing a few things and could look at adding this finally. Is there a function in rstan to detect divergences?
I’m glad this has garnered some interest. Sounds like I’m not the only one!
Will report back in a day or two.
The trick is to use Welford’s algorithm to accumulate sufficient statistics. That covers means and variances. I just added this to our new road map; I’m trying to keep it to big features we would actually like to build.
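For reference, a minimal sketch of Welford’s algorithm for a single scalar quantity. It keeps only the count, the running mean, and the running sum of squared deviations, so nothing has to be stored per draw. The class name is illustrative and this is not how Stan implements it internally.

```python
class RunningStats:
    """Online mean and variance via Welford's algorithm."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from the old mean and from the updated mean.
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Sample variance (n - 1 denominator); NaN with fewer than 2 values."""
        if self.n < 2:
            return float("nan")
        return self.m2 / (self.n - 1)
```

You would keep one of these per parameter per chain and call `update` on each new draw.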
Since I’m looking at the samples anyway, I should also be able to add the time-based and ESS-based termination criteria. Did you have anything in mind for how to specify/guess the number of warmup iterations for these?
We were thinking that if we were timing, we’d try to split the time evenly between warmup and sampling.
As to the online things, I’m more worried about being able to diagnose when warmup has converged (the mass matrix and step sizes should converge and we should find the high probability mass of the posterior).
Ah, presumably by doing a few iterations first to get a feel for the time/iteration?
Is there any existing code in rstan to compute these checks?
This is a problem because we adapt as we warm up, so early warmup iterations are in the wrong region of the posterior and haven’t fully adapted step size (integration time) and mass matrix (metric). If we knew how to do this easily, it’d already be done! But we haven’t really even begun experimenting.
No. In the end we don’t really care if adaptation has converged if the chains mix well. It just seems like it’s the only way to figure out when we should stop adapting. We could also run some real sampling in parallel and measure that at various points. That’ll give us a real read of how much time we have left and how well we’re mixing from where we’re at.
OK, then I’ll focus on merely online reporting of the existing metrics (divergences, max tree depth, ESS, R-hat) post-warmup for the near term. The termination stuff feels more like it should be in Stan rather than rstan or an R helper package anyway.
We’ve gone back and forth between designs where the sampler is a simpler iterator and the one we have now, where everything gets controlled through a combination of services config and callbacks.
Just pushed an update to ezStan that will show post-warmup divergences as they occur. Installation & usage demo here, let me know if you encounter any bugs!
Note, I’m really looking forward to the sample storage refactor; what I do in watch_stan to watch the sample files is a rather fragile hack.