Parameter sweep methods

I have a model which samples well with NUTS, but takes about 10 hours to run. There are a few (6) hyperparameters that I'm providing by hand, since trying to infer them has had a strong negative effect on Neff/s.

I'd like to tune these, and use the best values for later stages of this project, which will use NUTS sampling exclusively. My question is whether the estimates from optimization or variational methods are useful for search/comparison if I do sampling afterwards.
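For concreteness, the kind of cheap sweep I have in mind looks roughly like the sketch below (the model file, the `base_data` list, and the hyperparameter name `tau` are placeholders for my actual setup):

```r
library(rstan)

# Placeholder model and data: "my_model.stan", base_data, and "tau" stand in
# for the real model, data list, and one of the six hand-set hyperparameters.
sm <- stan_model("my_model.stan")

tau_grid <- c(0.1, 0.5, 1, 2, 5, 10)   # candidate values for one hyperparameter

sweep <- lapply(tau_grid, function(tau) {
  dat <- c(base_data, list(tau = tau))  # base_data: the usual data list; tau passed as data

  # Cheap surrogates: posterior mode and ADVI approximation
  opt    <- optimizing(sm, data = dat, as_vector = FALSE)
  vb_fit <- tryCatch(vb(sm, data = dat), error = function(e) NULL)

  list(tau        = tau,
       lp_mode    = opt$value,  # log posterior at the mode
       vb_summary = if (!is.null(vb_fit)) summary(vb_fit)$summary else NULL)
})

# Compare log-posterior modes (or VB summaries) across the grid before
# committing to a 10-hour NUTS run at the chosen value.
sapply(sweep, function(s) s$lp_mode)
```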

It's hard to say what's going on without more details, but it sounds like it might be a problem with your model. I get that sense because I've found that models that take a long time to fit and have a low effective sample size (Neff) can (though not always) be a sign that Stan is having trouble with the model as currently written.

If you're using RStan, try some of the graphical diagnostics, like pairs plots. This paper (https://arxiv.org/abs/1709.01449) has lots of good examples of how to use them. One other suggestion would be to pare down your data to a small simulated dataset that you can use to test and debug. 10 hours is a long time to wait!
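Something along these lines, as a rough sketch (the model file and the parameter names "mu" and "sigma" are placeholders for your actual model):

```r
library(rstan)

# Placeholder model file and parameter names; substitute your own.
sm <- stan_model("my_model.stan")

# Pare the problem down: simulate a small fake dataset with known values,
# so fits take minutes instead of 10 hours and recovery can be checked.
N_small  <- 50
sim_data <- list(N = N_small,
                 y = rnorm(N_small, mean = 1.5, sd = 0.7))  # toy data only

fit_small <- sampling(sm, data = sim_data, chains = 4, iter = 1000)

# Graphical and HMC diagnostics on the small fit
pairs(fit_small, pars = c("mu", "sigma", "lp__"))
check_hmc_diagnostics(fit_small)
print(fit_small, pars = c("mu", "sigma"))
```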


Indeed, I am somewhat new to model inversion and Stan, but I've done tens of iterations on the model design, and right now I just want to tune and fix the parameters for which I don't have good priors and which can't be sampled effectively.

Hence my question of whether optimization or variational inference is a useful approximation to guide the parameter search.

The fact that your parameters aren't being sampled efficiently might mean they don't matter that much (or they introduce a non-identifiability which doesn't matter, or something). I'd just pick a few values, run the fits, and see if your conclusions change any. Probably look at posterior predictives and such to see if everything is in the ballpark. Maybe your solutions just aren't sensitive to the parameters, in which case who really cares what they are (and that's a nice place to be)? If they are sensitive, it's probably back to the modeling.
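As a rough sketch of what I mean (the model file, `base_data`, `tau`, and `y_rep` are placeholders, and it assumes the model declares `y_rep` in generated quantities):

```r
library(rstan)

# Sketch of the "pick a few values, refit, compare" idea with placeholder names.
sm <- stan_model("my_model.stan")
tau_values <- c(0.5, 2, 10)

fits <- lapply(tau_values, function(tau) {
  sampling(sm, data = c(base_data, list(tau = tau)), chains = 4, iter = 1000)
})

# Compare a simple posterior predictive summary across the settings; if the
# substantive conclusions barely move, the exact hyperparameter value may not matter.
for (i in seq_along(tau_values)) {
  y_rep <- extract(fits[[i]], pars = "y_rep")$y_rep
  cat("tau =", tau_values[i], "; mean(y_rep) =", mean(y_rep), "\n")
}
```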

Eh, I'd only switch to those (optimization or variational inference) if I had a model that was sampling really well and I wanted to see if I could get it to run faster.

This is a good, if anticlimactic, point. I guess we wanted to infer some of these parameters even if the model & data don’t allow for it, so in that case more work on the model is required.