How to get samples from the posterior predictive distribution using Stan

Jean_Billie · January 29, 2019, 9:43am

My questions are:

1. How can I get samples from the posterior predictive distribution using rstan?
2. What is the difference between two distributions: the model distribution with its parameter fixed at the EAP (expected a posteriori) estimate, and the posterior predictive distribution (PPD)?

In Andrew Gelman's book "Bayesian Data Analysis", a statistical model is evaluated using data drawn from the posterior predictive distribution p( \cdot | D) for given data D. Recall that it is defined by

p(y | D) = \int f(y|\theta)\pi( \theta|D )d\theta

where y is future data, f(y|\theta) is the likelihood (model) with parameter \theta, and \pi( \theta|D ) is the posterior distribution.
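
In practice this integral is rarely evaluated in closed form. As a sketch, assuming S draws \theta^{(1)}, ..., \theta^{(S)} from the posterior \pi(\theta|D) are available (the symbols S and \theta^{(s)} are introduced here only for illustration), the usual Monte Carlo approximation is

p(y | D) \approx \frac{1}{S} \sum_{s=1}^{S} f(y | \theta^{(s)}), \qquad \theta^{(s)} \sim \pi(\theta | D)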

Using rstan, is it possible to draw data of the kind declared in the data block of the Stan file from the posterior predictive distribution? To tell the truth, my model is very complex and I am not sure of the concrete form of f(y | \theta). Using the lp__ samples in a stanfit object together with Monte Carlo integration gives me one method, but I would still need f(y | \theta), and this is very hard for me.
I wonder whether there is a significant difference between samples drawn from the model distribution f( \cdot | \theta _{\text{EAP}}) and samples drawn from the posterior predictive distribution.

That is, the difference between two random variables X and Y,

X \sim p(y|D)

Y \sim f(y| \theta_{EAP})

where p(y|D) is the posterior predictive density and f(y| \theta_{EAP}) is the model at the EAP estimate.
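
To make the comparison of X and Y concrete, here is a minimal sketch in Python (standard library only, chosen purely for illustration rather than R). It assumes a toy model that is not from this thread: normal observations with known standard deviation `sigma`, a normal prior on \theta with standard deviation `tau`, and made-up data. In that conjugate case the posterior is known exactly, so direct draws stand in for MCMC output.

```python
import random
import statistics

random.seed(1)

# All numbers below are illustrative assumptions, not values from the thread.
sigma = 1.0                          # known observation sd
tau = 10.0                           # prior sd for theta (prior mean is 0)
data = [0.3, 1.9, -0.4, 1.1, 0.6]    # made-up data D

# Conjugate normal posterior for theta given D.
n = len(data)
post_var = 1.0 / (1.0 / tau**2 + n / sigma**2)
post_mean = post_var * sum(data) / sigma**2

# Stand-in for MCMC draws of theta from the posterior.
S = 20000
theta_draws = [random.gauss(post_mean, post_var**0.5) for _ in range(S)]
theta_eap = statistics.mean(theta_draws)   # EAP estimate = posterior mean

# X ~ p(y | D): one y per posterior draw of theta.
x = [random.gauss(theta, sigma) for theta in theta_draws]
# Y ~ f(y | theta_EAP): every y uses the same plug-in parameter.
y = [random.gauss(theta_eap, sigma) for _ in range(S)]

var_x = statistics.variance(x)
var_y = statistics.variance(y)
print("variance of X (posterior predictive):", var_x)
print("variance of Y (plug-in at EAP):      ", var_y)
```

In this toy case X has variance close to sigma^2 plus the posterior variance of \theta, while Y has variance close to sigma^2 alone, so the plug-in distribution ignores the remaining uncertainty about \theta. The gap shrinks as the amount of data grows; whether it matters for a more complex model has to be checked case by case.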

jjramsey · January 29, 2019, 1:16pm

Since you already have the book Bayesian Data Analysis, I can point you to a couple relevant parts of that book. I’m using the 3rd edition.

From p. 146:

In practice, we usually compute the posterior predictive distribution using simulation. If we already have S simulations from the posterior density of \theta, we just draw one y^{\rm rep} from the predictive distribution for each simulated \theta; we now have S draws from the joint posterior distribution, p(y^{\rm rep}, \theta|y).

There’s also R code on page 596 of the book that illustrates how to do what was just quoted above. For any onlookers who don’t have the book, said code can be found here: http://www.stat.columbia.edu/~gelman/book/software.pdf

Jean_Billie · January 30, 2019, 11:19am

Thank you!!
Your link helps me, since my book is the second edition, which does not include the Stan code.
I cannot yet understand the sentence from p. 146.

I will read the web page and try to understand.

Jean_Billie · February 1, 2019, 4:34am

Thank you!! Now I understand!!

Let f( y | \theta) be a model. Given data y_0, we can get MCMC samples \theta_1,\theta_2,...,\theta_N from Stan. From these samples we get the sequence of models f( y | \theta_1), f(y|\theta_2), ...,f(y|\theta_N). Drawing data y_1,y_2,...,y_N so that

y_1 \sim f( y | \theta_1),

y_2 \sim f( y | \theta_2),

\cdots

y_N \sim f( y | \theta_N),

we can interpret the samples y_1,...,y_N as draws from the posterior predictive distribution. To implement this procedure we no longer need Stan, provided that sampling from f can be done with R's stats package.
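
The procedure above can be sketched in a few lines. This is an illustration in Python (standard library only) rather than R, and everything in it is an assumption made for the example: the helper names `posterior_predictive_draws` and `simulate_exponential` are hypothetical, the model f(y|\theta) is taken to be exponential with rate \theta, and the `theta_draws` list is a stand-in for the draws one would actually extract from a fitted Stan model.

```python
import random

def posterior_predictive_draws(theta_draws, simulate, rng):
    """Return one simulated y_i ~ f(y | theta_i) for each posterior draw theta_i."""
    return [simulate(theta, rng) for theta in theta_draws]

def simulate_exponential(rate, rng):
    # Hypothetical model f(y | theta): exponential distribution with rate theta.
    return rng.expovariate(rate)

rng = random.Random(2019)

# Stand-in for N MCMC draws theta_1, ..., theta_N; here they are simply
# drawn from a gamma distribution (shape 12, scale 1/6) for illustration.
N = 4000
theta_draws = [rng.gammavariate(12.0, 1.0 / 6.0) for _ in range(N)]

# y_1, ..., y_N: one predictive draw per theta_i.
y_rep = posterior_predictive_draws(theta_draws, simulate_exponential, rng)
print("number of predictive draws:", len(y_rep))
print("first few draws:", y_rep[:3])
```

Only a simulator for f(y|\theta) is needed here, not its density. Inside Stan itself, the same kind of draw is commonly produced in a generated quantities block using the `_rng` function of the relevant distribution, which avoids the post-processing step.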
