Getting different samples when running rstan with seed option

Jaslene_Lin · March 31, 2018, 10:58am

Dear stan users and developers:

I am currently running a very simple experiment to generate some data from my proposed model with ‘known’ parameter values and then fit these generated data to the model to examine whether the model can recover the ‘known’ parameter values.

However, I got different sets of generated data(the data are generated in generated quantities block with normal_rng function and I supply ‘known’ mean and standard deviation in the data block) when simulating from my proposed model with ‘known’ parameters.
I set seed = 123456 within stan() command, with default 2000 iterations on 4 chains and I think each of the resulting 4000 will be a random sample of data that I can then use to fit to the model to estimate the ‘known’ parameter values.

so I choose 4000th iteration, but when I ran it a second time and still chose the 4000th iteration, the generated data are different to the previous one. I am wondering where did it go wrong?

hope for some clarification, thanks in advance

mitzimorris · March 31, 2018, 4:11pm

which interface are you using?

Jaslene_Lin · March 31, 2018, 4:53pm

I am using rstan

mitzimorris · March 31, 2018, 9:40pm

the Rstan extract method has parameter permute which should be set to false

this might help:
https://cran.r-project.org/web/packages/rstan/vignettes/stanfit-objects.html

Jaslene_Lin · March 31, 2018, 9:47pm

Thank you for your reply. what you meant this that each time i extract the stanfit object, it changes the ordering of the iterations and what I specify as the 4000th iteration of generated data is not the same one as the ordering has changed. I set permuted = TRUE but it will give me a matrix of 4 columns each represent the chain but I don’t know what are the rows correspond to the parameters in my model?

mitzimorris · April 1, 2018, 2:26pm

run one chain.

mitzimorris · April 1, 2018, 2:31pm

set permute=FALSE

Bob_Carpenter · April 13, 2018, 7:09pm

That defeats using convergence diagnostics like R-hat.

This should work in general, even with multiple chains. Look at the vignette for rstan about extracting posterior information on the order of all the arguments.

You don’t need to run 1000 sampling iterations per chain. Running to n_eff of 200 or so (50 per chain) should be sufficient (n_eff will only go to 200 if R-hat gets close to 1), which probably won’t take 4000 posterior draws.

jonah · April 15, 2018, 5:14am

Now also hosted at

All of our R package vignettes are now available in higher quality than the ones on CRAN (size limitations).

Topic		Replies	Views
Running stan in R with only generated quantities RStan	7	2470	May 27, 2021
Random subsampling using the extract function Modeling rstan , fitting-issues	0	338	May 23, 2023
Different Outputs in RStan vs. PyStan Interfaces	28	3006	April 19, 2020
Slightly discrenpancy between Rstan and Cmdstanr output General	2	500	February 1, 2021
Different results on the same data with the same seed Modeling fitting-issues	1	322	May 18, 2023

Getting different samples when running rstan with seed option

Related topics