Is there any way to output when there is a divergence or the max_treedepth is hit during sampling?
Sometimes I am trying to adjust (Stan parameters or re-parameterize) a model that takes a long time to run. It would be useful to know early on if, e.g., I’m getting lots of divergences or chains hitting the max_treedepth.
I could imagine a flag to the cmdstan-produced executable, or some extra support in Stan itself via the print statement. Then you’d need something to test (I guess in the generated quantities block? Not sure) and you could print something if the condition was true.
Is anything like this possible?
The CSV files are produced during sampling, so you could read in the partial CSV file and check its sampler diagnostic columns as it grows.
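In case it helps, here’s a minimal sketch of that idea in Python (just for illustration; the file path is an assumption, and `max_treedepth=10` is CmdStan’s default, so pass your own value if you changed it). It skips CmdStan’s `#` comment lines, finds the `divergent__` and `treedepth__` columns in the header, and counts warnings in whatever rows have been written so far:

```python
# Sketch: count divergences and max-treedepth hits in a (possibly partial)
# CmdStan CSV file. The path and max_treedepth are assumptions; CmdStan's
# default max_treedepth is 10.
import csv

def sampler_warnings(path, max_treedepth=10):
    divergences = 0
    saturations = 0
    with open(path) as f:
        # Skip CmdStan's '#' comment lines; the first remaining line is the header.
        rows = csv.reader(line for line in f if not line.startswith("#"))
        header = next(rows, None)
        if header is None:
            return divergences, saturations  # nothing written yet
        div_col = header.index("divergent__")
        depth_col = header.index("treedepth__")
        for row in rows:
            if len(row) != len(header):
                continue  # the last line may be half-written
            divergences += int(float(row[div_col]))
            saturations += int(float(row[depth_col])) >= max_treedepth
    return divergences, saturations
```

You’d call something like `sampler_warnings("output.csv")` periodically while the chain runs.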
Ah! I hadn’t thought of that. That’ll work. Thanks!
How often does it write out to those? I’m running something now and one chain says it’s at 100/2000 (5%), but that file (and all the other CSV files) show nothing but the header so far…
Oh. Maybe warmup samples don’t show up? That would make it less useful. But there’s a way to include those right? Goes looking
You need to set `save_warmup=1` with the `sample` argument:

```
./test sample save_warmup=1 num_samples=1000 …
```
Just remember that divergences during warmup, even during late warmup, are healthy and expected. The adaptation aggressively probes larger step sizes after each metric update (i.e., at the beginning of each new window), and often diverges. Even in the middle of a window, some models tend to diverge just as the step size gets large enough to yield poor acceptance probabilities, so you can get divergences during warmup that are just the result of the dual averaging probing values a bit higher than the one it will ultimately choose.
Edit: by the same token, treedepth saturations during late warmup are also healthy and expected. When the adaptation aggressively probes larger step sizes, the dual averaging often subsequently explores step-sizes that are much too small before settling back down to a near-optimal step-size. So you can get a mixture of divergences and max treedepths for up to a few dozen iterations after metric updates, and this is perfectly healthy.
I have a development-paused package, aria, that shows a variety of during-warmup/during-sampling diagnostics.
Thanks @rok_cesnovar, @jsocolar and @mike-lawrence ! This is all very helpful.
@mike-lawrence, my workflow is not via R. Is any of what aria does possible from cmdstan?
But it may be that what I want isn’t possible. It sounds like it’s pretty hard to tell from early warmup whether the sampling will be problematic in the sense of divergences or hitting max_treedepth.
If there were a way to do that, I think it would be useful. Especially since, in my limited experience, those very issues cause the model to take a long, long time to run, which makes it hard to do the iteration necessary to fix things. It would be super-helpful if you could see you had a problem in a few minutes rather than a few hours.
You definitely cannot tell from early warmup unless the model is of the sort that runs super fast anyway. If the model is having problems early in warmup, it’s got the whole rest of warmup to tune the sampler to eliminate those problems. Indeed, that’s what warmup is for :)
If you’re a shell-script wizard, sure, but in that case you’ll be missing out on lots of work already done by others to do it all. I’m hoping to find time to port what aria does to a python implementation with html dashboard for use in VSCode, but that’s not going to be ready any time soon.
Not a wizard, no. But my workflow comes from Haskell: I wrote a lot of code to use Stan (via cmdstan) from Haskell. Haskell is prepping the data, writing the Stan code, and then pulling the results back in for further use.
Calling out to R is…more complicated. So I guess I’d need to see what it would take using cmdstan. By shell-script wizard, you mean there are files to parse to know what is happening? Or something else?
Yeah, all aria does (in terms of its during-sampling info) is parse the csv files while sampling takes place.
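For illustration only (this is not aria’s actual code, and aria is written in R), the kind of during-sampling monitoring being described could be sketched in Python as a small class that re-scans the growing CSV on each poll, processes only the newly completed lines, and keeps running warning counts. The column names `divergent__` and `treedepth__` are real CmdStan sampler columns; the polling scheme and `max_treedepth=10` default are assumptions:

```python
# Sketch (assumed design, not aria's implementation): incrementally monitor
# a CmdStan CSV while sampling runs, keeping running counts of divergences
# and max-treedepth saturations.
class CsvMonitor:
    def __init__(self, path, max_treedepth=10):
        self.path = path
        self.max_treedepth = max_treedepth
        self.header = None
        self.lines_seen = 0  # non-comment lines already processed
        self.divergences = 0
        self.saturations = 0

    def update(self):
        # Re-read the file (it is small relative to model runtime) and
        # process any newly completed lines since the last poll.
        with open(self.path) as f:
            lines = [ln for ln in f.read().splitlines() if not ln.startswith("#")]
        i = self.lines_seen
        while i < len(lines):
            fields = lines[i].split(",")
            if self.header is None:
                self.header = fields
                self.div_col = fields.index("divergent__")
                self.depth_col = fields.index("treedepth__")
            elif len(fields) == len(self.header):
                self.divergences += int(float(fields[self.div_col]))
                self.saturations += int(float(fields[self.depth_col])) >= self.max_treedepth
            else:
                break  # half-written last line; pick it up on the next poll
            i += 1
        self.lines_seen = i
        return self.divergences, self.saturations
```

Usage would be something like `mon = CsvMonitor("output.csv")`, then calling `mon.update()` every few seconds while the sampler runs.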
I think I could manage that from Haskell, which is pretty good on the parsing front. I’ll take a look at aria!
Well, don’t have your expectations too high code-quality-wise; it was my first foray into OOP, and my primary goal was to get working examples of workflow features rather than robust/readable code 😬
Didn’t have any particular expectations! Just hoping I can figure out how you get from what’s in the sampler files to diagnostic output. Then I can reproduce that, I think.
I really appreciate all the help.
The model I am struggling with is presenting difficulties but mostly, I think, because I don’t know what I’m doing! So knowing something bad is happening faster would be very helpful.
I’m trying to combine election data (millions of voters and votes) with surveys (tens of thousands of voters and votes) to estimate voter turnout and preference based on state and demographic variables. I’m doing a few things that I don’t see covered much:
- combining data from different sources
- combining data with vastly different numbers of counts (which makes the binomial tricky)
- modeling turnout and preference (or any two different things) in the same model so that uncertainties propagate correctly to the post-stratification.
Anyway, I’m learning a lot and you all have been very helpful!
Very minor rebuttal: while divergences and tree-depth warnings during warmup can be, and often are, the result of the sampler adaptation and/or pathological geometries far outside of the typical set, they are not always. It does require some expertise, but some warning patterns (step size decreasing while accept_stat stays small indicates an error in the gradient calculation; divergences at small step sizes can indicate geometries that will be pathological no matter the final adaptation configuration; etc.) can be monitored during warmup to identify real problems that will persist into the main sampling phase. Again, this requires expertise and shouldn’t be attempted lightly, but it is possible.
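As a rough sketch of the raw numbers you’d watch for that kind of diagnosis, here is a Python helper (illustrative only; the window size is an arbitrary assumption, and the thresholds for “small” are deliberately left to the reader) that summarizes recent `stepsize__` and `accept_stat__` values from a partial CmdStan CSV:

```python
# Sketch: summarize recent stepsize__ and accept_stat__ values from a
# (possibly partial) CmdStan CSV. The window size is an illustrative
# assumption; interpreting the numbers requires the expertise noted above.
def recent_adaptation_summary(path, window=50):
    header = None
    rows = []
    with open(path) as f:
        for line in f:
            if line.startswith("#"):
                continue  # CmdStan comment lines
            fields = line.rstrip("\n").split(",")
            if header is None:
                header = fields
            elif len(fields) == len(header):
                rows.append(fields)
    if header is None or not rows:
        return None  # nothing written yet
    step_col = header.index("stepsize__")
    acc_col = header.index("accept_stat__")
    recent = rows[-window:]
    mean_step = sum(float(r[step_col]) for r in recent) / len(recent)
    mean_acc = sum(float(r[acc_col]) for r in recent) / len(recent)
    # Per the discussion above: a shrinking step size with persistently low
    # accept_stat can hint at a gradient error, but interpret with care.
    return mean_step, mean_acc
```

You could print these two means every few seconds during warmup and watch how they evolve across adaptation windows.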