Extracting VB's precision matrix

Oh wow, if the Rhat and Neff are bad on this, then this is a bad thing.

Can you supply inits from the VB too and see how that works? The process of getting from the [-2, 2] inits to around where you want to sample can look a bit different than the sampling itself.

Or even supply inits from the regular hmc run itself.

Also I don’t like setting seeds when I do things like this. Not doing seeds leads to more variation in the experiments, but that’ll keep you from accidentally optimizing around the behavior of one seed.

You mean pass init = list(get_init(fit.vb)) to fit.inv_metric?

You mean initialise the fit.inv_metric one with samples from fit.unit_metric? I can do this, but it would (I think) defeat the purpose of fit.inv_metric in the first place, wouldn’t it?

Yeah so take a posterior draw from the vb and initialize the sampling with that.

Nono, take a posterior draw from the inference you trust. Just so you know it’s starting in a good spot.

OK, starting with get_init(fit.vb) works nicely - still much faster than the unit metric, and no more Rhat/Neff issues. Still not sure how robust this workflow is, but it does seem to give significant speedups for this model and data.
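
For anyone who wants to try the init step without my helper functions, here is a minimal sketch of the idea in plain Python: pick one joint draw per chain from an approximate posterior and use it as that chain's starting point. The `pick_inits` function and the `vb_draws` values are hypothetical, made up for illustration, and this is not necessarily what `get_init` does internally. The one thing that matters is using the same row index for every parameter, so each init is a joint draw rather than a mix of unrelated marginal values.

```python
import random


def pick_inits(draws, n_chains, rng=None):
    """Pick one joint draw per chain from a dict of parameter -> list of draws.

    Hypothetical helper: `draws` is assumed to hold equally long lists,
    where position i across all lists is one joint posterior draw.
    """
    if rng is None:
        rng = random.Random()  # unseeded on purpose, so runs vary
    n_draws = len(next(iter(draws.values())))
    inits = []
    for _ in range(n_chains):
        i = rng.randrange(n_draws)
        # Same index i for every parameter keeps the draw joint.
        inits.append({name: values[i] for name, values in draws.items()})
    return inits


# Made-up approximate-posterior draws for two parameters.
vb_draws = {"mu": [0.9, 1.1, 1.0, 1.2], "sigma": [0.5, 0.4, 0.6, 0.5]}
inits = pick_inits(vb_draws, n_chains=4)
```

The resulting list has one dictionary per chain, which is the shape most Stan interfaces accept for user-supplied inits.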

If anyone tries this on another model/data, I’d be very curious to hear what worked and what didn’t.

Nice!

Are your functions general enough to package up? Cuz that’s step 1 in getting other people to try it.

Haha, point taken. ;-) I’ll submit a PR to stanbreaker later this week.

FWIW, you can also pass the inits and inverse metric (or something very similar) in one step by passing the posterior means and sds (or medians and mads) as offsets and multipliers in the HMC run. This works by transforming the posterior into one whose marginal scales are all roughly one, which matches the unit inverse metric Stan starts with by default. Stan then initializes everything on [-2, 2] in the unconstrained space, which the offsets and multipliers map to roughly the bulk of the (marginals of the) posterior.
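
To make the arithmetic concrete, here is a rough sketch in plain Python, assuming a single parameter whose earlier draws are roughly normal. The function names and the simulated draws are made up for illustration. The 1.4826 factor is the usual normal-consistency scaling of the MAD, so that it is comparable to a standard deviation. The affine map `theta = offset + multiplier * theta_raw` is the one Stan applies for a parameter declared with an offset and a multiplier.

```python
import random
import statistics


def offset_and_multiplier(draws):
    """Return (median, scaled MAD) of the draws for one parameter."""
    med = statistics.median(draws)
    # 1.4826 makes the MAD comparable to an sd for normal draws.
    mad = 1.4826 * statistics.median(abs(x - med) for x in draws)
    return med, mad


def to_raw(theta, offset, multiplier):
    """Map a value on the original scale to the standardized scale."""
    return (theta - offset) / multiplier


def from_raw(theta_raw, offset, multiplier):
    """Inverse map: theta = offset + multiplier * theta_raw."""
    return offset + multiplier * theta_raw


# Simulated "earlier run" draws: far from zero and on a tiny scale.
# The seed is only there to make this illustration reproducible.
rng = random.Random(1)
draws = [rng.gauss(50.0, 0.1) for _ in range(4000)]

offset, multiplier = offset_and_multiplier(draws)
raw = [to_raw(x, offset, multiplier) for x in draws]
raw_sd = statistics.stdev(raw)  # close to 1, so a unit metric fits

# The default init interval [-2, 2] on the raw scale lands on the
# bulk of the posterior marginal on the original scale.
init_range = (from_raw(-2.0, offset, multiplier),
              from_raw(2.0, offset, multiplier))
```

On the raw scale the parameter sits near zero with scale near one, which is why the default unit metric and the default inits both end up in a sensible place.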

There’s a bit of extra headache involved for parameters that need additional constraints (e.g. standard deviations, non-centered random effects, etc.), but it is still feasible: declare the parameters on the unconstrained space, transform them (including any necessary Jacobian adjustments), and use the medians and mads of the back-transformed draws for the offsets and multipliers.
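
As a rough illustration for a single positive parameter, here is a plain-Python sketch under the assumption that the constraint is handled with a log transform. The lognormal target, the function names and the simulated draws are made up for the example. The Jacobian term for `sigma = exp(u)` is `log|d sigma / d u| = u`, and the offset and multiplier are computed from the draws after mapping them back to the log scale. The affine map itself only adds a constant to the log density, so it is left out here.

```python
import math
import random
import statistics


def lognormal_lpdf(x, mu, s):
    """Log density of LogNormal(mu, s) at x > 0."""
    return (-math.log(x * s * math.sqrt(2 * math.pi))
            - (math.log(x) - mu) ** 2 / (2 * s ** 2))


def normal_lpdf(u, mu, s):
    """Log density of Normal(mu, s) at u."""
    return -math.log(s * math.sqrt(2 * math.pi)) - (u - mu) ** 2 / (2 * s ** 2)


def log_density_unconstrained(u, mu, s):
    """Log density of u = log(sigma) when sigma ~ LogNormal(mu, s)."""
    sigma = math.exp(u)
    log_jacobian = u  # log |d sigma / d u| = log(exp(u)) = u
    return lognormal_lpdf(sigma, mu, s) + log_jacobian


# Simulated earlier draws of a positive parameter (illustration only).
rng = random.Random(2)
sigma_draws = [math.exp(rng.gauss(-1.0, 0.3)) for _ in range(4000)]

# Back-transform to the unconstrained scale, then take median and MAD.
log_draws = [math.log(x) for x in sigma_draws]
offset = statistics.median(log_draws)
multiplier = 1.4826 * statistics.median(abs(u - offset) for u in log_draws)


def sigma_from_raw(raw):
    """Standardized value -> log scale -> positive scale."""
    return math.exp(offset + multiplier * raw)
```

With the Jacobian term included, the density on the log scale is exactly the Normal(mu, s) density, which is an easy way to check that the adjustment has the right sign.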
