Generating random numbers in the model

I am trying to implement a simulated likelihood model for a stochastic differential equation in Stan, as described in Simulation and Inference for Stochastic Differential Equations (section 3.3.2).

The manual says normal_rng can only be used in the generated quantities block.

All I was able to find is this discussion from back in 2012.

I could generate the necessary amount of random numbers in R and pass them in as data, although that would be less efficient due to serialization/deserialization of a large amount of data.
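For instance, a minimal sketch of what the data block might look like (names are hypothetical):

data {
  int<lower=1> num_samples;
  int<lower=1> num_steps;
  matrix[num_samples, num_steps] z; // standard normal draws pre-generated in R
}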

Is there a better way?

I understand the problem of the non-deterministic lp. What if the rng inside the model could be initialized with a fixed seed at the beginning of each leapfrog step? Then the lp would be deterministic and smooth.

The reason you can’t use RNGs as part of the model definition is that Stan works by defining a differentiable log density. Using an RNG would break differentiability, as we can’t differentiate through it. It also violates Bayesian information-flow principles in the same way that “cut” in BUGS did.

If you need to generate continuous “random” variables, just define parameters and give them sampling distributions. That will not be equivalent to just generating RNGs, but it will have the correct information-flow properties, where the context in which the parameter is used feeds information back into the parameter.
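For example, a minimal sketch of that approach (assuming num_steps and sigma are declared elsewhere; the names are just illustrative):

parameters {
  vector[num_steps] z; // the would-be random draws, promoted to parameters
}
model {
  z ~ normal(0, sigma); // the generating distribution becomes part of the joint density
  // ... wherever z is used, information flows back into z
}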

Using an RNG would break differentiability, as we can’t differentiate through it.

I don’t see why. If you used a constant seed, the random numbers would effectively be constants — you’d have no problem differentiating through them.

If you need to generate continuous “random” variables, just define parameters and give them sampling distributions.

This doesn’t sound at all like what I need, primarily because I want the numbers to be truly random and not “optimized” by Stan to maximize the log probability. I suspect that would bias the result.

Not to mention all the efficiency concerns: I don’t want Stan to waste time optimizing a huge array of random numbers, storing them in memory, returning them as part of the result, and so on.

To clarify, the model I am thinking about looks something like this:

model {
  // simulate num_samples trajectories
  for (i in 1:num_samples) {
    real x = x0;
    // for each trajectory, simulate num_steps steps
    for (j in 1:num_steps) {
      real z = normal_rng(0, sigma); // this is what Stan disallows in the model block
      x = update(x, z);              // update() stands in for the SDE transition
    }
    // see how well the end point of the trajectory matches our expectations
    x ~ normal(x1, sigma);
  }
}

Let me try another angle.

Stan requires you to define a joint log density proportional to the posterior if you want to do full Bayesian inference. Then the information flows from the data to the parameters.

When you use RNGs in the model, that flow is cut.

If your sigma is a constant, then you’d be generating z at random and somehow using those draws to define a variable x (which is also defined in terms of x0).

Sampling doesn’t optimize the log density, but the optimizer does.

I’m also not sure what you mean by “truly random”. It’s that property that cuts the flow of information.

What you’re suggesting is often done for things like multiple imputation, where a joint analysis is too costly. And it’s done using the cut operation in BUGS. But we don’t support these things in Stan because they’re not based on a model plus full Bayesian inference. Instead, it’s procedural.

Sampling doesn’t optimize the log density, but the optimizer does.

Yes, at the moment I’m mainly interested in ML/MAP.

What you’re suggesting is often done for things like multiple imputation, where a joint analysis is too costly. […] But we don’t support these things in Stan because they’re not based on a model plus full Bayesian inference. Instead, it’s procedural.

Yes, it is multiple imputation in a way. Now I see what you were suggesting above when you told me to make it a parameter; thanks. But this amounts to a very different computational method. I’d be curious to see how the two compare, but in order to compare them, I’d still need to implement the original simulated likelihood, probably by passing in random numbers as part of the data.

You can generate random numbers in the transformed data block. There’s just not a good way to pass in more general samples other than through data.

Even with optimization, we can’t differentiate through the RNG.
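A minimal sketch of the transformed data route (assuming num_steps is declared in data and the Stan version allows _rng functions in transformed data):

transformed data {
  real z[num_steps];
  for (j in 1:num_steps)
    z[j] = normal_rng(0, 1); // drawn once per chain at startup, then held fixed
}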

What you plan to do sounds very similar to what we did in our paper:

https://www.e-publications.org/ims/submission/AOAS/user/submissionFile/32344?confirm=157ec8b6

We pass a bunch of random numbers into Stan, and we also give recipes to ensure convergence of the results. The supplements contain the sources, and I am happy to share them if this is useful.


I am very interested in reading the source code in your supplements, but I couldn’t open it from the journal website. Would you mind sharing it with me? Thanks!

I got access to download the supplements now; could you point out where you pass in the random numbers? The code is pretty long … Thanks!

Cool that someone is diving into this! I recall that the simulation example is straightforward, so start with that. I can dig up the code tomorrow and have a look in case you are still struggling. Note that I generated the random numbers within R, if my memory serves me right.

EDIT: Now I remember: I passed the chain_id into the Stan program. The Stan model then selected out of a big array the sub-slice of random numbers, which were all generated in R.

I checked the code of linear_baad.R and pkpd_baad.R, but I am not sure which lines of code you were referring to. I saw that the chain id was fixed at 4 in both files.

Hi!

I am using 4 chains at most. The random numbers are generated in the linear_baad.R script on line 161 (named xi). This is an array of length 4 (as many chains as I have) over a matrix, and in the Stan program I select the set of random numbers according to the CHAIN_ID, which is handled within the sampling command. You can see that CHAIN_ID is varied, as I print it when the Stan program starts.
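Schematically, the selection looks something like this (a sketch with illustrative names, not the exact code from the supplement):

data {
  int<lower=1, upper=4> CHAIN_ID;   // supplied per chain via rstan's sampling command
  int<lower=1> num_sim;
  int<lower=1> num_steps;
  matrix[num_sim, num_steps] xi[4]; // random numbers generated in R, one matrix per chain
}
transformed data {
  matrix[num_sim, num_steps] xi_chain = xi[CHAIN_ID]; // this chain's sub-slice
}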

Best,
Sebastian


Thank you so much! I was able to generate random numbers in R and then pass them into Stan as you suggested. But I still have a problem with selecting the set of random numbers according to the CHAIN_ID. I saw that in linear_baad.R line 161 you generated random numbers in R and then passed them into Stan in lines 164-166. But I didn’t understand how Stan knows which set of random numbers to use according to CHAIN_ID (in your linear_mvn_approx.stan, lines 82 and 142). I tried to run your code so that I could understand better, but it ran into an error: “value for ‘pstream__’ not found”. If you could provide more explanation on this, that would be great!

The model I wrote is rather hard on the compiler, and it may be that it does not compile at the moment. I would recommend using rstan 2.17, which is what I used at the time, I think.

The CHAIN_ID is generated on the fly by rstan and is passed as data to the Stan program. This is not supported by cmdstan, for example.

If you have to run this with a newer Stan version, then I can look into the matter and get it running again.

It works for me now! Your code helped substantially!

@serenejiang: Hi, I’m working on something similar to this idea, where I need to generate random quantities in the model block. Could you please share the code from @wds15’s paper with me?

Thanks

The code can be found in the supplementary materials of the paper; here is the link to download them: https://projecteuclid.org/euclid.aoas/1536652966#supplemental

@ppho Please see the link here: https://projecteuclid.org/euclid.aoas/1536652966#supplemental

Bob, would it be possible to use a different random number in each iteration?

I am trying to implement a two-step estimator. Estimates from the first step become data for the second step but should not be influenced by it. The second step needs to account for the uncertainty in the estimates from the first step.

What I am doing so far is estimating the first step and storing the chains, then passing the mean and sd to the second step and using them to generate a parameter. But it seems the draws are still influenced by the likelihood of the second step.
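Schematically, the second step looks something like this (a sketch with hypothetical names):

data {
  real theta_hat;         // posterior mean from the first step
  real<lower=0> theta_sd; // posterior sd from the first step
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real theta;
  real<lower=0> sigma;
}
model {
  theta ~ normal(theta_hat, theta_sd); // first-step posterior used as a prior
  y ~ normal(theta, sigma);            // but the second-step likelihood still updates theta
}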

This should be trivial to do in a single model, but we should probably start a new topic for it. Can you do so and post the code you have for the two separate models? Then I should be able to advise on how to combine them properly.