Pre-posterior distributions in Stan?

I am evaluating a particular study design on the basis of average posterior variance and related statistics, which essentially means examining the frequency properties of the design. The aim is to evaluate the probability of achieving a certain level of precision given various design characteristics such as sample size, number of clusters, etc. I do this on the basis of the pre-posterior distributions, so I (i) simulate data from the model and prior distributions in R, (ii) send the data to Stan to sample from the posteriors (with the same priors), (iii) get whatever statistic I need, and then (iv) repeat. Obviously, doing this a large number of times is slow with MCMC, especially if you want to look at lots of different design parameters. Variational approximations (like meanfield) are no good, as they seem terrible at approximating the posterior variance. So my question is: is there any way this could be done faster, within Stan or otherwise? I don’t expect so, but thought it best to ask!
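For readers unfamiliar with the procedure, steps (i) to (iv) can be sketched roughly as follows. This is only an illustration under loud assumptions: it swaps the R + Stan setup for a toy Beta-Binomial model in plain Python, where the posterior is available in closed form, so no MCMC is involved. The function names, the Beta(2, 2) prior, and the target posterior sd are all hypothetical choices, not part of the original setup.

```python
import random
import statistics


def beta_posterior_sd(a, b, y, n):
    """Closed-form posterior sd of theta for a Beta(a, b) prior and y successes in n trials."""
    a_post, b_post = a + y, b + n - y
    total = a_post + b_post
    return (a_post * b_post / (total ** 2 * (total + 1))) ** 0.5


def prob_precision(n, target_sd, a=2.0, b=2.0, n_sims=2000, seed=1):
    """Estimate P(posterior sd <= target_sd) and the average posterior sd for sample size n."""
    rng = random.Random(seed)
    sds = []
    for _ in range(n_sims):
        theta = rng.betavariate(a, b)                     # (i) draw the parameter from the prior
        y = sum(rng.random() < theta for _ in range(n))   # (i) simulate data given that parameter
        sds.append(beta_posterior_sd(a, b, y, n))         # (ii)-(iii) posterior statistic (analytic stand-in for Stan)
    hits = sum(sd <= target_sd for sd in sds)             # (iv) summarise over the repetitions
    return hits / n_sims, statistics.fmean(sds)


# Compare a few hypothetical designs that differ only in sample size.
for n in (20, 50, 100, 200):
    p, mean_sd = prob_precision(n, target_sd=0.05)
    print(f"n={n:3d}  P(sd <= 0.05) ~ {p:.3f}  mean posterior sd ~ {mean_sd:.4f}")
```

In the real problem the call to `beta_posterior_sd` is where the Stan fit would go, and that fit is the expensive part of each iteration.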


What you are describing seems to be Simulation Based Calibration (SBC), in which case about the only way to speed things up is to use more cores, which is what the sbc function in RStan does.

Thanks @bgoodri. I’ve not heard it called that, but yes, it looks like the same thing, although for a slightly different purpose. I’ve got RStan running multicore. I think the only way to possibly improve on that is to have Stan run on only one core, then parallelise the data simulation and the calls to Stan, with each core running a single chain. Would having multiple instances of Stan like this be possible (if it is even worth it)?

No, if you’re already running one chain per core, that’s the best you can do. I’d advise against using only single chains per simulated data set as that breaks some useful convergence diagnostics.
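To illustrate why a single chain is limiting, here is a minimal sketch of the classic Gelman-Rubin potential scale reduction factor, which compares between-chain and within-chain variance and therefore cannot be computed from one chain. Note the assumptions: this is the textbook formula, not the rank-normalized split R-hat that recent versions of Stan report, and the "chains" are just simulated normal draws rather than real sampler output.

```python
import random
import statistics


def rhat(chains):
    """Classic Gelman-Rubin R-hat for a list of equal-length chains of scalar draws."""
    m = len(chains)
    if m < 2:
        raise ValueError("R-hat needs at least two chains to estimate between-chain variance")
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W: mean within-chain variance
    between = n * statistics.variance(chain_means)                       # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n                        # pooled variance estimate
    return (var_plus / within) ** 0.5


rng = random.Random(42)

# Four chains exploring the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]

# Same setup, but one chain is stuck around a different location: R-hat is clearly above 1.
stuck = [row[:] for row in mixed[:3]] + [[rng.gauss(3.0, 1.0) for _ in range(1000)]]

print("well mixed:", round(rhat(mixed), 3))
print("one stuck chain:", round(rhat(stuck), 3))
```

With only the stuck chain in hand, nothing in its own draws would reveal the problem; it is the disagreement with the other chains that flags it.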
