My questions are:

How to get a sample from the posterior predictive distribution using rstan.

Whether there is a significant difference between two distributions: the model distribution whose parameter is taken at the EAP (expected a posteriori) estimate, and the posterior predictive distribution (PPD).
In Andrew Gelman's book "Bayesian Data Analysis", he evaluates a statistical model using data from the posterior predictive distribution p(\,\cdot \mid D) for given data D. Recall that it is defined by
p(y \mid D) = \int f(y \mid \theta)\, \pi(\theta \mid D)\, d\theta
where y is a future datum, f(y \mid \theta) is the likelihood (model) with parameter \theta, and \pi(\theta \mid D) is the posterior distribution.

Using rstan, is it possible to draw the data described in the data block of a Stan file from the posterior predictive distribution? To tell the truth, my model is very complex and I am not sure of the concrete form of f(y \mid \theta). Using the lp__ samples in a stanfit object together with a Monte Carlo integral gives me one method, but I would also have to obtain f(y \mid \theta), which is very hard for me.

I wonder whether there is a significant difference between the samples obtained from the model distribution f(\,\cdot \mid \theta_{\text{EAP}}) and those from the posterior predictive distribution.
That is, the difference between the two random variables X, Y with
X \sim p(y \mid D)
Y \sim f(y \mid \theta_{\text{EAP}})
where p(y \mid D) is the posterior predictive density and f(y \mid \theta_{\text{EAP}}) is the model evaluated at the EAP estimate.
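To get an intuition for the difference, here is a minimal sketch in R with a toy conjugate-normal model of my own (not from this thread): y ~ Normal(theta, 1) with prior theta ~ Normal(0, 1). The two distributions share the same center, but the PPD is wider than the plug-in model f(y \mid \theta_{\text{EAP}}) because it propagates the posterior uncertainty in \theta.

```r
# Toy conjugate-normal example (illustrative assumption, not the OP's model):
# data y ~ Normal(theta, 1), prior theta ~ Normal(0, 1).
set.seed(1)
y0 <- rnorm(20, mean = 2, sd = 1)        # observed data
n  <- length(y0)

# Conjugate posterior of theta is Normal(post_mean, post_sd^2)
post_sd   <- sqrt(1 / (1 + n))
post_mean <- post_sd^2 * sum(y0)
theta_eap <- post_mean                   # EAP = posterior mean here

# Plug-in samples: Y ~ f(y | theta_EAP)
y_plugin <- rnorm(1e5, mean = theta_eap, sd = 1)

# PPD samples: first draw theta from the posterior, then y | theta
theta_draws <- rnorm(1e5, mean = post_mean, sd = post_sd)
y_ppd <- rnorm(1e5, mean = theta_draws, sd = 1)

# Same mean, but the PPD standard deviation is close to sqrt(1 + post_sd^2),
# while the plug-in standard deviation is close to 1.
c(sd_plugin = sd(y_plugin), sd_ppd = sd(y_ppd))
```

With much data the posterior concentrates, post_sd shrinks, and the two distributions become nearly indistinguishable; with little data the gap can matter.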
Since you already have the book Bayesian Data Analysis, I can point you to a couple relevant parts of that book. I’m using the 3rd edition.
From p. 146:
In practice, we usually compute the posterior predictive distribution using simulation. If we already have S simulations from the posterior density of \theta, we just draw one y^{\rm rep} from the predictive distribution for each simulated \theta; we now have S draws from the joint posterior distribution p(y^{\rm rep} \mid y).
There’s also R code on page 596 of the book that illustrates how to do what was just quoted above. For any onlookers who don’t have the book, said code can be found here: http://www.stat.columbia.edu/~gelman/book/software.pdf
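For onlookers, here is a minimal sketch of how that quoted procedure looks with rstan, using a hypothetical toy model of my own (the model, data, and variable names are illustrative assumptions, not the OP's model): a generated quantities block draws one replicated data set per posterior draw, so the extracted y_rep matrix contains draws from the posterior predictive distribution.

```r
library(rstan)

# Hypothetical toy model: y ~ Normal(mu, sigma).
# generated quantities draws one y_rep per posterior draw of (mu, sigma),
# i.e. one draw from f(y | theta) per simulated theta.
model_code <- "
data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
generated quantities {
  vector[N] y_rep;
  for (n in 1:N)
    y_rep[n] = normal_rng(mu, sigma);
}
"

y <- rnorm(50, mean = 1, sd = 2)   # fake data for the sketch
fit <- stan(model_code = model_code,
            data = list(N = length(y), y = y))

# Each row of y_rep is one draw from the posterior predictive distribution
y_rep <- rstan::extract(fit)$y_rep
dim(y_rep)   # S posterior draws x N observations
```

The advantage of doing this in generated quantities is that you never need to write f(y \mid \theta) down outside the Stan program: the same sampling statements that define the model produce the replications.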
Thank you !!
Your link helps me, since my copy is the second edition, which does not include the Stan code.
For now I cannot understand the sentence from p. 146.
I will read the web page and try to understand.
Thank you!! Now I understand!!
Let f(y \mid \theta) be a model. Then, given data y_0, we can get MCMC samples \theta_1, \theta_2, \ldots, \theta_N with Stan. From these samples we get the sequence of models f(y \mid \theta_1), f(y \mid \theta_2), \ldots, f(y \mid \theta_N). Drawing data y_1, y_2, \ldots, y_N so that
y_1 \sim f(y \mid \theta_1),
y_2 \sim f(y \mid \theta_2),
\cdots
y_N \sim f(y \mid \theta_N),
we can interpret the samples y_1, \ldots, y_N as draws from the posterior predictive distribution. And to implement this procedure, we no longer need Stan, as long as such sampling can be done with the stats package.
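As a concrete sketch of that last point, again with a hypothetical normal model f(y \mid \mu, \sigma) (the names draws, mu, and sigma below are illustrative assumptions): once the posterior draws \theta_1, \ldots, \theta_N are extracted from the stanfit object, the base stats package is indeed enough to draw one y_i \sim f(y \mid \theta_i) per draw.

```r
# In practice: draws <- rstan::extract(fit), with draws$mu, draws$sigma.
# Here the posterior draws are faked so the sketch is self-contained.
set.seed(1)
draws <- list(mu    = rnorm(4000, mean = 1, sd = 0.1),
              sigma = abs(rnorm(4000, mean = 2, sd = 0.1)))
N <- length(draws$mu)

# One y_i ~ f(y | theta_i) per posterior draw, using only stats::rnorm;
# rnorm is vectorized over its mean and sd arguments.
y_ppd <- rnorm(N, mean = draws$mu, sd = draws$sigma)

# y_ppd now holds N draws from the posterior predictive distribution
length(y_ppd)
```

This works whenever the model's sampling distribution is available as a random-number generator in R; for models without such a generator, the generated quantities approach inside Stan is usually easier.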