Nuisance parameters prior from posterior

Yeah, in this case it’s not so much a cut that we want. What we want is to run a fit that only has to estimate a few tens of gene-level parameters, because as the dimension gets high, with a vague prior on the nuisance parameters, the probability that the sampler gets stuck or has problems on at least one chain during warmup seems to approach 1. With a small number of total parameters it’s generally well behaved. So the idea is to do a run on genes we’re not interested in to get “pre-estimates” of the nuisance parameters, then use those estimates as a prior in a big run on the many, many genes we ARE interested in, which will hopefully keep the sampler from failing to warm up properly.

I do think we already know that, in general, tighter priors on nuisance parameters help (maybe they don’t solve all the problems, but they at least reduce some of them). We just want these priors to be informed by some extraneous data we happen to have before running the big inference on the main stuff of interest.

The biggest question is how to use samples from the first small run within the code for the second run to create a prior. Say there are maybe 10 nuisance parameters: it seems like 10D kernel density estimation would be a nightmare… We could just create, say, normal approximations using the mean and sd of each parameter separately, but that loses all correlation information from the first run (and I’m not sure whether that’s a problem here). I suppose with only 10 or so nuisance parameters we could do a full multivariate normal estimate and just hand the mean vector and covariance matrix to the final run as data… but that doesn’t seem to scale so well to cases with more parameters.
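
To make the multivariate normal option concrete, here is a rough sketch of the summarising step in Python with NumPy. It assumes the draws from the small run are already in an array with one row per draw and one column per nuisance parameter; the function name `summarize_draws`, the data keys `K`, `nuis_mu` and `nuis_L`, and the simulated stand-in draws are all made up for illustration. It passes the Cholesky factor rather than the covariance matrix itself, on the assumption that the second model would use something like `multi_normal_cholesky` for the prior.

```python
import numpy as np


def summarize_draws(draws, jitter=1e-9):
    """Reduce posterior draws to a mean vector and Cholesky factor.

    draws: array of shape (n_draws, n_params) from the small first run.
    jitter: small diagonal term (an assumption here) to keep the
            estimated covariance numerically positive definite.
    """
    draws = np.asarray(draws, dtype=float)
    mu = draws.mean(axis=0)
    # rowvar=False: columns are parameters, rows are draws.
    sigma = np.atleast_2d(np.cov(draws, rowvar=False))
    sigma = sigma + jitter * np.eye(sigma.shape[0])
    chol = np.linalg.cholesky(sigma)  # lower-triangular factor
    # Plain lists so the result can be dumped to JSON as data
    # for the second run.
    return {"K": int(mu.size), "nuis_mu": mu.tolist(), "nuis_L": chol.tolist()}


# Stand-in for real draws: 3 correlated nuisance parameters.
rng = np.random.default_rng(0)
fake_cov = np.array([[1.0, 0.6, 0.0],
                     [0.6, 1.0, -0.3],
                     [0.0, -0.3, 0.5]])
fake_draws = rng.multivariate_normal([0.0, 2.0, -1.0], fake_cov, size=4000)

stan_data = summarize_draws(fake_draws)
```

The independent-normals version is the special case where only the diagonal of the covariance is kept. On the scaling worry: the factor has K(K+1)/2 free entries, so the number of draws needed to estimate it decently grows quickly with K.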