Hello, all,
I tried to draw 640 random samples using normal_rng(0.15, 5) in Stan.
Once I got the simulated data, I computed the mean and it was close to 0.15, which is nice and good. However, the standard deviation is only 0.08. This is puzzling me.
On the other hand when I use R : data = rnorm(n=640, mean = 0.15, sd=5).On computing the standard deviation of R simulated data, I get the SD = 5, which is what I thought I should also get in Stan.
What is the reason for this difference between R and Stan normals draws?
Can any one help me with it.
Much appreciated.
Ants.
1 Like
Can you post your code? Both the Stan code and what you’re doing in R to process the results.
1 Like
Sure Stan code
data {
int<lower=0> N; // Number of obs = 640
}
generated quantities {
vector[N] error;
for (i in 1:N){
error[i]= normal_rng(0.15, 5);
}
}
Output from Stan: **
**** Variable | Mean Std. Dev. **
-------------±--------------------------------------------------------
**** error | .1539694 .0766563
In R
data = rnorm(n=640, mean = 0.15, sd=5)
sd(data)
mean(data)
Output from R :
> sd(data)
[1] 5.054347
> mean(data)
[1] 0.1379644
Thanks Mike as always for your help.
If you’re looking this summary
output from rstan
:
**** Variable | Mean Std. Dev. **
-------------±--------------------------------------------------------
**** error | .1539694 .0766563
The Std.Dev
refers to the standard deviation of the estimates across all of the iterations, not the standard deviation of the vector itself.
You can verify that the Stan and R functions both produce similar results by using the stanFunctions
function to export normal_rng
to R and compare the results:
# Provide 'dummy' arguments so that the correct signatures
# are exported
StanHeaders::stanFunction("normal_rng", mu = c(0, 0), sigma = 1)
#> [1] 0.7751841 -0.4114922
mu = rep(0.15, 100000)
stan_rng = normal_rng(mu, 5)
mean(stan_rng)
#> [1] 0.1310089
sd(stan_rng)
#> [1] 5.00299
r_rng = rnorm(100000, mean = 0.15, sd = 5)
mean(r_rng)
#> [1] 0.176546
sd(r_rng)
#> [1] 4.989584
Created on 2022-04-25 by the reprex package (v2.0.1)
2 Likes
Thanks for the explanation.
So what you are saying is, if I want to simulate 500 data points with a mean of say 0.15 and standard deviation of 5, I cannot do this with Stan alone - that is normal_rng will not do it.
I have to use R or some other platform to simulate this data?
Please correct me if I am wrong.
You can simulate it in Stan exactly as you are doing so, it’s just that the summary statistics that you are looking at don’t represent what you think they do
1 Like