Running Stan in R with only generated quantities

I am in the following situation: I have already run my model in Stan and obtained posterior samples, but now I would like to compute other quantities, like predictive distributions, without resampling. For this purpose I created another Stan model that takes the covariate data and the posterior samples as data inputs and has just a generated quantities block to compute what I need (through random-number generators).
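A minimal sketch of such a "generated quantities only" model (the variable names and the normal likelihood are illustrative assumptions, not the poster's actual model):

```stan
data {
  int<lower=0> N;                  // number of new observations
  int<lower=0> K;                  // number of covariates
  matrix[N, K] X_new;              // covariates for the new data
  int<lower=0> S;                  // number of posterior draws
  matrix[S, K] beta_draws;         // posterior draws of the coefficients
  vector<lower=0>[S] sigma_draws;  // posterior draws of the residual sd
}
generated quantities {
  matrix[S, N] y_rep;              // posterior predictive draws
  for (s in 1:S)
    for (n in 1:N)
      y_rep[s, n] = normal_rng(X_new[n] * beta_draws[s]', sigma_draws[s]);
}
```

Because there is no parameters or model block, nothing is estimated; the generated quantities block just replays the RNG once per stored posterior draw.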

Now my question: how should I properly run it in R? I tried calling the stan() command with algorithm = "Fixed_param", but it is very slow and starts “sampling” (but sampling what? I do not have any parameters or model block!)

Thank you.


You only want to run generated quantities once – make sure you called the fixed_param sampler with

fit <- stan(file='foo.stan', iter=1, chains=1, seed=596858228, algorithm="Fixed_param")
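For completeness, a sketch of how the posterior draws from the first fit can be passed into the second model as data (the object names `fit1`, `beta`, `sigma`, `X_new` and the file name `foo_gq.stan` are illustrative assumptions, not from the thread):

```r
library(rstan)

draws <- rstan::extract(fit1)        # posterior draws from the original fit
gq_data <- list(
  S = nrow(draws$beta),              # number of posterior draws
  beta_draws = draws$beta,           # pass the draws in as data
  sigma_draws = draws$sigma,
  N = nrow(X_new),
  K = ncol(X_new),
  X_new = X_new
)
fit2 <- stan(file = "foo_gq.stan", data = gq_data,
             iter = 1, chains = 1, algorithm = "Fixed_param")
```

With iter = 1 and one chain, the generated quantities block runs exactly once over all S stored draws.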

I was not specifying the number of iterations and chains. Now it does just one sampling pass, but when it seems to finish, R crashes.
It looks like a RAM issue to me, but I have 8 GB and I am working with something like 11 parameters, 2000 iterations, and 17,000 data points. So that is roughly 374M “real numbers”. Could that be a problem with 8 GB?

@federico wants to run generated quantities and have the parameters from the old model available:

I don’t think that’s possible yet through R, but it’s in the pipeline.

That’s just the default label for iterations after warmup. It is most likely drawing random quantities in generated quantities, but I agree the name is confusing. Maybe we can make those labels more fine-grained for the fixed-param settings.

Ok, got it. Thank you.

I also tried with less data. Now I pass to Stan:

  • one 3300×8 matrix
  • one 17000×3 matrix

and compute:

  • one 1000×17000 matrix

I ran it with 16 GB of RAM and it still kills R. Am I missing something that makes this too much, or could it be a problem with Stan?

If those are parameters, it’s going to be too big. Each parameter requires at least 24 bytes of memory and every operation it’s involved in consumes another 8 bytes of memory.

374M double-precision floating-point numbers (what Stan uses for real) consume 8 bytes each, so that’s only about 3 GB. But that’s just one piece.
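The arithmetic behind that estimate, as a quick sanity check (using the 11 parameters × 2000 iterations × 17,000 data points mentioned above):

```python
# 11 parameters x 2000 iterations x 17000 data points, 8 bytes per double
n_reals = 11 * 2000 * 17000
gigabytes = n_reals * 8 / 1024**3
print(n_reals)              # 374000000 reals
print(round(gigabytes, 2))  # 2.79 (GiB) -- roughly 3 GB for one copy
```

Note this counts one copy of the values; rstan typically holds additional copies (the C++ output plus the R objects built from it), so peak usage can be a multiple of this.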

Yes, they come from rstan’s output. I thought they were stored just like doubles.

Thank you everybody!

Hello community,
I see that this topic, predicting for new data with generated quantities in rstan being slow, has been on the forums for many years, and I still have not found a great answer to it (maybe I just missed it somewhere?).
I had previously fit a model with brms, which ran smoothly, and its predict() works very fast.
However, I needed to change a small detail in the model that brms couldn’t handle, so I wrote my own model, carefully based on the brms-generated structure, and fit it using rstan. Up to this point all went well.
Then I wrote another Stan model with only input data and a generated quantities block to get a posterior for new data.
The data follow a Beta family, so for the fit I used target += beta_lpdf() and in the generated quantities I use Y = beta_rng().
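A minimal sketch of those two pieces, assuming the brms-style mean/precision parameterization of the Beta distribution (mu, phi); all names here are illustrative, not the poster's actual code:

```stan
// In the fitted model: Beta likelihood via the (mu, phi) parameterization
model {
  target += beta_lpdf(y | mu * phi, (1 - mu) * phi);
}

// In the prediction-only model: one predictive draw per new observation
generated quantities {
  vector[N_new] y_new;
  for (n in 1:N_new)
    y_new[n] = beta_rng(mu_new[n] * phi, (1 - mu_new[n]) * phi);
}
```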
I ran the “predict” code with the “Fixed_param” option, 1 chain, and 1 iteration, as suggested above.
It takes a very long time, RAM usage climbs above 3 GB (for just 1k samples), and R crashes every time.
I read somewhere that this could be an rstan problem, but again I found no clear answer or advice on what to do.
I am using Windows 10, R 4.0.1, RStudio 1.4, rstan 2.26.

Thank you very much for your great support!