Importance Sampling within Stan


I have been given the task of implementing importance sampling within a Stan model. I am investigating Poisson change point models (an inhomogeneous Poisson process whose rate follows a step function). Previously I have followed the technique used in the manual (User's Guide, 8.2), which involves marginalising out the latent parameters that describe when each change point occurs. In my case I have been using continuous time rather than discrete time, but the results are analogous to those in the manual. For a model with N change points, an N-dimensional grid is formed in relation to the data points, and log probabilities are calculated within each cell. In higher dimensions (3 or more change points) I have had memory issues, because the size of the grid grows with each change point added, as mentioned in the manual.
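For readers who have not seen the marginalisation being described, here is a minimal sketch of the single change point, discrete-time case in Python rather than Stan. The function names and the data are made up for illustration, and a uniform prior on the change point is assumed; it is not the poster's continuous-time model.

```python
import math

def poisson_lpmf(k, rate):
    """Log of the Poisson probability mass function."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def marginal_log_lik(counts, early_rate, late_rate):
    """Marginalise one discrete change point s, uniform prior on 1..T (assumed)."""
    T = len(counts)
    lp = []
    for s in range(1, T + 1):
        ll = -math.log(T)  # log of the uniform prior on the change point
        for t, d in enumerate(counts, start=1):
            # observations before s use the early rate, the rest the late rate
            ll += poisson_lpmf(d, early_rate if t < s else late_rate)
        lp.append(ll)
    return log_sum_exp(lp)

counts = [4, 5, 4, 1, 0, 1]  # made-up data for illustration
print(marginal_log_lik(counts, early_rate=4.0, late_rate=1.0))
```

With N change points the same enumeration needs on the order of T**N cells, which is where the memory problem comes from.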

I am now attempting to use importance sampling to avoid the computational issues that arise from marginalising 3 or more parameters. For each sampling iteration, I hope to generate, say, 1000 random values uniformly from the space of possible change points, and take the mean of the weighted pdfs as an approximation of the posterior. My feeling is that this will not be possible, as Stan only allows random number generators in the transformed data or generated quantities blocks, neither of which is executed in each sampling iteration. Is this due to differentiability of the posterior? Is there any way to avoid this problem and implement some form of importance sampling within a Stan model?
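The estimator being proposed can be sketched as follows, again in Python rather than Stan and in discrete time for brevity. It assumes the change points are drawn from their (uniform) prior, so every importance weight equals 1 and the estimate is a plain average of likelihoods; with a different proposal q, each term would be weighted by prior/q. All names here are hypothetical.

```python
import math
import random

def poisson_lpmf(k, rate):
    """Log of the Poisson probability mass function."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def log_lik(counts, change_points, rates):
    """Log-likelihood given change points and len(change_points) + 1 rates."""
    ll = 0.0
    for t, d in enumerate(counts, start=1):
        # number of change points at or before t selects the active rate
        segment = sum(1 for s in change_points if t >= s)
        ll += poisson_lpmf(d, rates[segment])
    return ll

def log_mean_exp(xs):
    """Numerically stable log of the mean of exp(x)."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs) / len(xs))

def mc_marginal_log_lik(counts, rates, draws):
    """Average the likelihood over change-point draws (proposal == prior assumed)."""
    return log_mean_exp([log_lik(counts, cp, rates) for cp in draws])

def uniform_draws(T, n_change_points, n_draws, seed):
    """Sorted change-point tuples drawn uniformly from 1..T."""
    rng = random.Random(seed)
    return [sorted(rng.randint(1, T) for _ in range(n_change_points))
            for _ in range(n_draws)]

counts = [4, 5, 4, 1, 0, 1, 6, 7, 5]  # made-up data for illustration
draws = uniform_draws(len(counts), 2, 1000, seed=1)
print(mc_marginal_log_lik(counts, [4.0, 1.0, 6.0], draws))
```

Because the draws are an argument rather than generated inside the function, the estimate is a deterministic function of the rates for a fixed set of draws.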

Thanks for your help



You can always pass some randomness in as data. While it’s not very satisfying, it works and lets you implement a functional importance sampling step. It wouldn’t even break the calculation of gradients. Similarly, you could implement an LCG (linear congruential generator) or similar as a Stan function and use that. As for efficiency, YMMV. After all, they’re all deterministic functions.
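To make the LCG suggestion concrete, here is a minimal sketch in Python; in Stan the same arithmetic would go in a user-defined function. The function name is hypothetical, and the constants are the widely quoted Numerical Recipes choice, used here only as an assumption. An LCG like this is fine for illustration but not of high statistical quality.

```python
def lcg_uniforms(seed, n):
    """Return n pseudo-random values in [0, 1) from a linear congruential generator.

    Constants (assumed): a = 1664525, c = 1013904223, m = 2**32.
    """
    a, c, m = 1664525, 1013904223, 2**32
    state = seed % m
    out = []
    for _ in range(n):
        state = (a * state + c) % m  # the LCG recurrence
        out.append(state / m)        # scale the integer state to [0, 1)
    return out

# The same seed always gives the same sequence, so the result is deterministic.
print(lcg_uniforms(seed=12345, n=5))
```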

Thanks for the advice. I’ve been trying to implement the randomness as data, as you say. Previous discussions indicate that RNG functions are allowed in the transformed data block (Generating random numbers in the model); however, according to the functions reference (21.1), they may only be used in the generated quantities block. Maybe this restriction is part of a more recent update?

Yeah, I’d just pass it to Stan as data or write a simple RNG. You can reuse the same random numbers every iteration…
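One way to prepare such draws outside Stan is sketched below in Python. The field names (`T`, `D`, `N`, `K`, `cp_draws`) are hypothetical and would have to match whatever the model's data block declares; the draws are uniform on (0, T) on the assumption of a continuous-time model with T unit-length intervals.

```python
import json
import random

def make_stan_data(counts, n_change_points, n_draws, seed):
    """Build a data dictionary holding fixed change-point draws (names assumed)."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    T = len(counts)
    draws = [sorted(rng.uniform(0.0, T) for _ in range(n_change_points))
             for _ in range(n_draws)]
    return {"T": T, "D": counts, "N": n_change_points,
            "K": n_draws, "cp_draws": draws}

data = make_stan_data([4, 5, 4, 1, 0, 1], n_change_points=3, n_draws=1000, seed=42)
payload = json.dumps(data)  # JSON text that could be written to a data file
print(len(payload))
```

Since the draws never change between iterations, the Monte Carlo estimate stays a smooth, deterministic function of the remaining parameters.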