I have a model that seems to fail as soon as it finishes the warmup phase and begins the sampling phase.
Here’s some of the output from a run with 100 warmup iterations. The same thing happens with 250 and 1,000 warmup iterations: each chain finishes warmup, attempts one sampling iteration, and fails silently.
Chain 3 Iteration: 98 / 1100 [ 8%] (Warmup)
Chain 1 Iteration: 101 / 1100 [ 9%] (Sampling)
Chain 4 Iteration: 99 / 1100 [ 9%] (Warmup)
Chain 2 Iteration: 95 / 1100 [ 8%] (Warmup)
Chain 2 Iteration: 96 / 1100 [ 8%] (Warmup)
Chain 3 Iteration: 99 / 1100 [ 9%] (Warmup)
Chain 4 Iteration: 100 / 1100 [ 9%] (Warmup)
Chain 4 Iteration: 101 / 1100 [ 9%] (Sampling)
Chain 2 Iteration: 97 / 1100 [ 8%] (Warmup)
Warning: Chain 1 finished unexpectedly!
Chain 3 Iteration: 100 / 1100 [ 9%] (Warmup)
Chain 2 Iteration: 98 / 1100 [ 8%] (Warmup)
Chain 3 Iteration: 101 / 1100 [ 9%] (Sampling)
Warning: Chain 4 finished unexpectedly!
Chain 2 Iteration: 99 / 1100 [ 9%] (Warmup)
Chain 2 Iteration: 100 / 1100 [ 9%] (Warmup)
Warning: Chain 3 finished unexpectedly!
Chain 2 Iteration: 101 / 1100 [ 9%] (Sampling)
Warning: Chain 2 finished unexpectedly!
Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Warning messages:
1: All chains finished unexpectedly! Use the $output(chain_id) method for more information.
2: No chains finished successfully. Unable to retrieve the fit.
Anyone seen anything like this before? I’m on cmdstanr with cmdstan version 2.31.0.
Normally I would accompany this with an MWE, but it’s a relatively large model from ongoing, non-shareable research. I’m hoping people might have a sense of where I could start looking.