Running stan in R with only generated quantities

federico · February 13, 2018, 9:02am

Hello,
I am in the following situation: I have already run my model in Stan and obtained posterior samples, but now I would like to compute other quantities, such as predictive distributions, without resampling. For this purpose I created another Stan model that takes the covariates and the posterior samples as data and has just a generated quantities block to compute what I need (through random-number generators).

Now my question: how should I properly run it in R? I tried calling the stan command with the fixed-parameter sampler, but it is very slow and starts “sampling” (but what sampling? I do not have any parameters or model block!)

Thank you.

betanalpha · February 13, 2018, 9:13am

You only want to run generated quantities once – make sure you called the fixed_param sampler with

fit <- stan(file='foo.stan', iter=1, chains=1, seed=596858228, algorithm="Fixed_param")

federico · February 13, 2018, 9:33am

I was not specifying the number of iterations and chains. Now it runs just one iteration, but R crashes just as it seems to finish.
It looks like a RAM issue to me, but I have 8 GB and I am working with about 11 parameters, 2000 iterations and 17000 data points. So that is about 374M “real numbers”. Could that be a problem with 8 GB?

Bob_Carpenter · February 13, 2018, 11:30pm

@federico wants to run generated quantities and have the parameters from the old model available:

I don’t think that’s possible yet through R, but it’s in the pipeline.

As for the “sampling” message: that’s just the default label for iterations after warmup. It is most likely sampling random quantities in generated quantities, but I agree the name is confusing. Maybe we can make those labels more fine-grained for the fixed-param settings.
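
For readers arriving later: rstan has since gained a standalone generated quantities method, gqs(), which takes a compiled model plus a matrix of posterior draws and evaluates only the generated quantities block for each draw. The sketch below is a hypothetical illustration, not the model from this thread: the Stan program, the names beta, sigma, X_new and y_new, and the helper predict_with_gqs() are all assumptions. As far as I understand, the parameters block has to declare the same parameter names as the original fit so that the draws can be matched up.

```r
# Hypothetical prediction-only program (assumed linear regression).
# It has no model block; the parameters block only declares the names
# whose posterior draws will be supplied from the earlier fit.
gq_code <- "
data {
  int<lower=1> N_new;
  int<lower=1> K;
  matrix[N_new, K] X_new;
}
parameters {
  vector[K] beta;
  real<lower=0> sigma;
}
generated quantities {
  vector[N_new] y_new;
  for (n in 1:N_new)
    y_new[n] = normal_rng(X_new[n] * beta, sigma);
}
"

# Hypothetical helper: `fit` is the stanfit object from the original model,
# `X_new` is a numeric matrix of new covariates.
predict_with_gqs <- function(fit, X_new, seed = 1234) {
  gq_model <- rstan::stan_model(model_code = gq_code)
  rstan::gqs(
    gq_model,
    data  = list(N_new = nrow(X_new), K = ncol(X_new), X_new = X_new),
    draws = as.matrix(fit),  # one row per posterior draw
    seed  = seed
  )
}
```

Calling predict_with_gqs(fit, X_new) should return a stanfit-like object from which y_new can be extracted. Because the draws come from the existing fit, they do not have to be passed in as data and stored a second time.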

federico · February 14, 2018, 8:59am

Ok, got it. Thank you.

I also tried with less data. Now I pass to stan:

one matrix 3300x8
one matrix 17000x3
and compute:
one matrix 1000x17000

I ran it with 16 GB of RAM and it still kills R. Am I failing to realise that this is too much, or could it be a Stan problem?

Bob_Carpenter · February 17, 2018, 8:03pm

If those are parameters, it’s going to be too big. Each parameter requires at least 24 bytes of memory and every operation it’s involved in consumes another 8 bytes of memory.

374M double-precision floating-point values (what Stan uses for real) consume 8 bytes each, so that’s only about 3 GB. But that’s just one piece.
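
The figures above can be checked with a few lines of R. This is only back-of-the-envelope arithmetic using the numbers quoted in this thread (11 parameters, 2000 iterations, 17000 data points, 8 bytes per double, at least 24 bytes per parameter); the variable names are made up for illustration.

```r
# Sizes quoted in the thread
n_params <- 11
n_draws  <- 2000
n_obs    <- 17000

# Total number of real values: 374,000,000
n_values <- n_params * n_draws * n_obs

# Stored as plain doubles (8 bytes each)
double_bytes <- n_values * 8
double_gb    <- double_bytes / 1e9    # 2.992 GB
double_gib   <- double_bytes / 2^30   # roughly 2.79 GiB

# Lower bound if every value were a Stan parameter (at least 24 bytes each)
param_bytes <- n_values * 24
param_gb    <- param_bytes / 1e9      # 8.976 GB

cat(sprintf("values: %.0f\n", n_values))
cat(sprintf("as doubles: %.3f GB (%.2f GiB)\n", double_gb, double_gib))
cat(sprintf("as parameters (lower bound): %.3f GB\n", param_gb))
```

So the raw doubles alone fit in 8 GB, but the same values treated as parameters would already need about 9 GB before counting any operations on them.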

federico · February 18, 2018, 3:08pm

Yes, they come from rstan’s output. I thought they were stored just like doubles.

Thank you everybody!

guisamor · May 27, 2021, 11:54am

Hello community,
I see that this topic (predicting with new data via generated quantities in rstan being slow) has been on the forums for many years, and I still have not found a great answer to it (maybe I just missed it somewhere?).
I had previously fit a model with brms, which ran smoothly, and predict() works very fast.
However, I needed to change a small detail in the model that brms couldn’t handle, so I wrote my own model, carefully based on the brms structure, and fit it using rstan. Up to this point all went well.
Then I wrote another model with only input data and generated quantities to get a posterior for new data.
The data are fit with a Beta family, so for the fit I used target += beta_lpdf() and for the generated quantities I use Y = beta_rng().
I run the “predict” code with the “fixed_param” option, 1 chain, and 1 iter as suggested above.
It takes a very long time, RAM usage goes above 3 GB (for just 1k samples), and R crashes every time.
I read somewhere that this could be an rstan problem, but again I found no clear answer or what to do about it.
I am using Windows 10, R 4.0.1, RStudio 1.4, rstan 2.26.

Thank you very much for your great support!
Guilherme
