Hi, I have a quick question about how exactly Stan computes the mean and quantiles for generated quantities. Let's say I have the following model, a simplified version of the one I am currently using:
model {
vector[N] mu = a + b_1 * x_1 + b_2 * x_2;
y ~ normal(mu, sigma);
}
generated quantities {
real g = a + b_1 * 5 + b_2 * 10;
}
The issue is that if I compute g by hand from the posterior means that Stan reports for a, b_1 and b_2, I do not get the same value as the mean that Stan reports for g. Why is that? In theory, the mean of g taken over all draws and chains should equal the same linear combination of the reported means of the three parameters.
Another concern is the quantification of uncertainty. I understand that covariance between the parameters also plays a role here, so one cannot simply construct the variance of g from the individual variances (which is why I am using the generated quantities block). But are the output quantiles taken over all the draws?
Thanks in advance for the help
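For reference, the identities behind this question can be written out. This is only a sketch, assuming g is exactly the linear combination above and that every expectation is taken over the same pooled set of posterior draws:

```latex
\begin{align*}
\mathbb{E}[g] &= \mathbb{E}[a] + 5\,\mathbb{E}[b_1] + 10\,\mathbb{E}[b_2], \\
\operatorname{Var}[g] &= \operatorname{Var}[a] + 25\,\operatorname{Var}[b_1] + 100\,\operatorname{Var}[b_2] \\
&\quad + 10\,\operatorname{Cov}[a, b_1] + 20\,\operatorname{Cov}[a, b_2] + 100\,\operatorname{Cov}[b_1, b_2].
\end{align*}
```

The first line holds exactly for sample means over the same draws, so a gap between the two values would have to come from somewhere other than the arithmetic, for example rounding in the printed summary. No comparable identity holds for quantiles.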
Try:
generated quantities {
real g = a + b_1 * 5.0 + b_2 * 10.0;
}
Do you get the same value if you first calculate g for every draw in R/Python and then take its mean?
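A minimal sketch of that check, in Python with only the standard library. The draws here are simulated stand-ins (an assumption made purely for illustration); with a real fit they would be the posterior draws of a, b_1 and b_2 pooled over all chains. The helper names `g_of` and `q05` are hypothetical.

```python
import random
import statistics

random.seed(1)

# Hypothetical stand-in for posterior draws pooled over all chains.
# A shared component z makes the three parameters correlated.
n_draws = 4000
a_draws, b1_draws, b2_draws = [], [], []
for _ in range(n_draws):
    z = random.gauss(0.0, 1.0)
    a_draws.append(1.0 + 0.5 * z + random.gauss(0.0, 0.2))
    b1_draws.append(0.3 - 0.1 * z + random.gauss(0.0, 0.05))
    b2_draws.append(-0.2 + 0.02 * z + random.gauss(0.0, 0.05))


def g_of(a, b_1, b_2):
    """Same linear combination as in the generated quantities block."""
    return a + b_1 * 5.0 + b_2 * 10.0


def q05(values):
    """5% sample quantile (first of the 19 cut points for n=20)."""
    return statistics.quantiles(values, n=20)[0]


# g computed draw by draw, then summarised.
g_draws = [g_of(a, b1, b2) for a, b1, b2 in zip(a_draws, b1_draws, b2_draws)]
mean_of_g = statistics.fmean(g_draws)
q05_of_g = q05(g_draws)

# g computed from the per-parameter summaries.
g_of_means = g_of(
    statistics.fmean(a_draws),
    statistics.fmean(b1_draws),
    statistics.fmean(b2_draws),
)
g_of_q05s = g_of(q05(a_draws), q05(b1_draws), q05(b2_draws))

print("mean of g:", mean_of_g, " g of means:", g_of_means)
print("5% quantile of g:", q05_of_g, " g of 5% quantiles:", g_of_q05s)
```

In this sketch the two means agree to floating-point precision because g is linear in the parameters, while the value built from the parameters' own 5% quantiles is far from the 5% quantile of g.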
I use the print and summary commands to get the results. Could that be the issue?
Probably not. Do you use the CmdStan interface?
Is there a programming language you use to do calculations (e.g. Python or R)?
If you calculate g per draw from the posterior, do you get the same g as in Stan? If you do, then g computed from the means is different from the mean of g.
I use Stan through RStudio. I use the print command from the Stan package and compare the value Stan prints, g^{print}, with g^{manual} = a^{print} + 5 b_1^{print} + 10 b_2^{print}, which I construct manually from the mean parameter values Stan prints.
My question is: should they be equal by construction or not? And if not, why?
So I guess you use the CmdStanR interface?
I think you could use the posterior package to extract the draws and then calculate g manually.
You could also do some plotting to see whether the parameters are correlated with each other.
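Alongside plots, the correlations can be checked numerically. The sketch below is in Python with only the standard library and again uses simulated draws as a stand-in for the real posterior (an assumption for illustration). The helpers `cov` and `corr` are hypothetical names defined here, not part of any Stan interface.

```python
import math
import random
import statistics

random.seed(2)

# Hypothetical stand-in for pooled posterior draws, correlated through z.
n_draws = 4000
params = {"a": [], "b_1": [], "b_2": []}
for _ in range(n_draws):
    z = random.gauss(0.0, 1.0)
    params["a"].append(1.0 + 0.5 * z + random.gauss(0.0, 0.2))
    params["b_1"].append(0.3 - 0.1 * z + random.gauss(0.0, 0.05))
    params["b_2"].append(-0.2 + 0.02 * z + random.gauss(0.0, 0.05))

# Coefficients of g = a + 5 * b_1 + 10 * b_2.
weights = {"a": 1.0, "b_1": 5.0, "b_2": 10.0}


def cov(x, y):
    """Sample covariance with the n - 1 denominator."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)


def corr(x, y):
    """Pearson correlation between two lists of draws."""
    return cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))


for p, q in [("a", "b_1"), ("a", "b_2"), ("b_1", "b_2")]:
    print(f"corr({p}, {q}) = {corr(params[p], params[q]):.3f}")

# Variance of g ignoring covariances versus including them.
var_naive = sum(weights[p] ** 2 * cov(params[p], params[p]) for p in params)
var_full = sum(
    weights[p] * weights[q] * cov(params[p], params[q])
    for p in params
    for q in params
)

# Variance of g computed directly from per-draw values.
g_draws = [
    a_i + 5.0 * b1_i + 10.0 * b2_i
    for a_i, b1_i, b2_i in zip(params["a"], params["b_1"], params["b_2"])
]
var_g = statistics.variance(g_draws)

print("variance ignoring covariances:", var_naive)
print("variance including covariances:", var_full)
print("variance of per-draw g:", var_g)
```

With these simulated draws a and b_1 are strongly negatively correlated, so the variance that ignores covariances overstates the spread of g, while the full expression matches the variance of the per-draw g.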