Can someone talk me out of the following idea? I feel icky.
If I’m having trouble getting a model to converge using sampling, how bad of an idea is it to fit an approximate posterior using mean-field variational inference and then use the VI posterior as the prior for sampling?
I think this may be a bad idea because I would be using the data twice - once to set the prior (by fitting a VI model) and once again in the likelihood. But I can’t think of a better way to set efficient priors for models with tens of thousands of parameters.
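Concretely, here's a toy sketch of the workflow I mean (conjugate Gaussian model so the "VI fit" has a closed form, plus a bare-bones Metropolis sampler as a stand-in for a real MCMC run; all names here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y_i ~ Normal(theta, 1), original prior theta ~ Normal(0, 10^2)
theta_true = 2.0
y = rng.normal(theta_true, 1.0, size=50)

# Stage 1: "mean-field VI". For this conjugate Gaussian model the optimal
# Gaussian q(theta) is available in closed form (a stand-in for a real VI fit).
prior_mu, prior_sd = 0.0, 10.0
post_var = 1.0 / (1.0 / prior_sd**2 + len(y) / 1.0)
q_mu = post_var * (prior_mu / prior_sd**2 + y.sum() / 1.0)
q_sd = np.sqrt(post_var)

# Stage 2: reuse q(theta) as the prior and run MCMC on the SAME data --
# this is the double use of the data I'm worried about.
def log_post(theta):
    log_prior = -0.5 * ((theta - q_mu) / q_sd) ** 2   # VI posterior as prior
    log_lik = -0.5 * np.sum((y - theta) ** 2)         # same data enters again
    return log_prior + log_lik

theta, samples = q_mu, []
for _ in range(5000):
    prop = theta + rng.normal(0.0, 0.5)
    if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
        theta = prop
    samples.append(theta)

# The chain is overconfident: because the data is counted twice, the
# posterior precision roughly doubles (variance roughly halves) relative
# to the true posterior for this model.
print(np.mean(samples), np.std(samples))
```

In this toy case you can see the problem directly: the true posterior standard deviation is about 0.14, but the two-stage chain concentrates near 0.10, i.e. the intervals are too narrow, which is exactly the double-counting effect.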