Models where Stan outperforms Nutpie/Walnuts

I think it would be incredibly valuable to add more models to posteriordb. It has a lot of small models, and a few very big but close to intractable models. Adding more big …

I agree. Not sure whether this is the right place to chime in, but I've been excited about the idea of speeding up Stan using Nutpie, Walnuts, etc. For pretty much all the models I would want to reach for this, though, I haven't actually been able to get them to do better; in most cases they don't converge or have a lower ESS/draw.

The models I'm referring to can have 1+ GB of compressed data and several hundred thousand parameters, so they are quite large as Stan models go.

The Stan models converge, but the Nutpie and Walnuts implementations would not. In a few models I was able to get Walnuts to converge, and although its time per draw was up to 2x Stan's, its ESS/draw was far lower, so I would need to run double the iterations or more to get the same number of effective draws. I don't have a strong enough grasp of the why, and I'm not able to share these models, unfortunately, so having much larger complex models to test against would hopefully be a step toward understanding why what I'm seeing doesn't match the performance charts floating around.

This isn’t a baseball model by any chance? I still have scars from some of those. They break everything.

I’m definitely interested to hear more about that, but if you can’t share the model that of course makes it a bit tricky. Maybe you can open a separate thread with what info you can give? (and ping me there).

One thing I've seen before was that NUTS was choosing too short a trajectory length for some reason. nutpie has an extra_doubling argument that you could set to 1 or so, which will make it run one extra tree doubling after it detects a U-turn. It would be interesting to see if that fixes the problem. Does Stan get good ESS (let's say > 300) for all of the parameters?
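For intuition about what an extra doubling buys, here is a toy sketch. This is not nutpie's actual tree-building code; `total_steps`, the step size, and the 2-D standard-Gaussian target are all illustrative assumptions. The idea: keep doubling a leapfrog trajectory until the endpoints satisfy a U-turn criterion, then optionally run extra doublings on top, which lengthens the trajectory beyond the point where plain NUTS would stop.

```python
import numpy as np

def grad_U(q):
    # gradient of the negative log-density of a standard Gaussian
    return q

def leapfrog_steps(q, p, eps, n, grad=grad_U):
    """Run n leapfrog steps, returning the final (q, p)."""
    q, p = q.copy(), p.copy()
    for _ in range(n):
        p = p - 0.5 * eps * grad(q)
        q = q + eps * p
        p = p - 0.5 * eps * grad(q)
    return q, p

def total_steps(q0, p0, eps=0.1, extra_doublings=0, max_depth=12):
    """Double the trajectory until a U-turn, then run extra doublings.

    Returns the total number of leapfrog steps taken.
    """
    q, p = leapfrog_steps(q0, p0, eps, 1)
    total = 1
    depth = 0
    # U-turn criterion between the trajectory endpoints:
    # stop once the end of the trajectory turns back toward the start
    while depth < max_depth and np.dot(q - q0, p) >= 0:
        q, p = leapfrog_steps(q, p, eps, total)  # double the total length
        total *= 2
        depth += 1
    # the extra doublings keep integrating past the detected U-turn
    for _ in range(extra_doublings):
        q, p = leapfrog_steps(q, p, eps, total)
        total *= 2
    return total
```

On a Gaussian the dynamics are a rotation, so the U-turn fires roughly half an orbit in; each `extra_doublings` increment then doubles the trajectory (and the cost per draw) from there, which matches the intuition that it trades speed per draw for longer trajectories.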

Ah, those baseball models were a few of the models I thought of!

But it's consistent across other models; most recently I have a large soccer model that is hierarchical in several ways, with 6 likelihoods, parameter sharing across them, data from 86 leagues, etc. Same issue.

In each of these cases ESS from Stan is fine.
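As a side note for anyone wanting to reproduce this kind of ESS check without a full toolchain: the sketch below is a simplified version of Geyer's initial positive sequence estimator for a single chain. It is illustrative only and not the exact multi-chain ESS computation Stan reports; the function name `ess` is my own.

```python
import numpy as np

def ess(x):
    """Effective sample size of a 1-D chain, via a simplified
    initial-positive-sequence estimator (Geyer-style sketch)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    # autocovariances at lags 0..n-1 (FFT would be faster; direct is fine here)
    acov = np.correlate(x, x, mode="full")[n - 1:] / n
    rho = acov / acov[0]
    # accumulate paired autocorrelations while the pairs stay positive
    tau = 1.0
    for k in range(1, n // 2):
        pair = rho[2 * k - 1] + rho[2 * k]
        if pair < 0:
            break
        tau += 2 * pair
    return n / tau
```

For roughly independent draws this returns something near the number of draws, while a strongly autocorrelated chain (e.g. AR(1) with coefficient 0.9) comes out far lower, which is the ESS/draw gap being described above.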

If it is the trajectories being too short: one of the main differences between Stan and Nutpie is that Stan is more conservative in finding an optimal tree depth, while Nutpie is more aggressive about making the trajectories shorter, which, to oversimplify, is where much of the speedup comes from.

I could set extra_doubling, and if it fixes things, it would also make the speed per draw more similar to Stan's. That would mean that in the models where Nutpie succeeds in posteriordb there is something about the geometry and model structure for which Stan's approach is too conservative, while in the models I'm running Nutpie is too aggressive. But I don't see an obvious pattern to generalize from.

I had also forked the Flatiron Walnuts repo to give myself extra tuning parameters, one of which is a continuous --unit-mass to float between the two ideas, but I haven't had a lot of time to actually investigate.

Feel free to move this into a new thread if it's too adjacent to the Sparse NUTS discussion.

I think this discussion is interesting and could go on for some time, so I have split it off.