Hi,

I have a question regarding how the prediction samples are drawn if using the generated quantities block in stan. Take the following simple codes as an example.

data{

int<lower=0> T; // number of samples

int<lower=0, upper=1> y[T]; // observed samples

real<lower = 0> shape1; // hyperparameter for beta prior

real<lower = 0> shape2; // hyperparameter for beta prior

int<lower=0> T_new; // number of predictions

}

parameters{

real<lower=0, upper=1> p;

}

model{

p ~ beta(shape1, shape2);

y ~ bernoulli(p);

}

generated quantities{

int<lower=0, upper=1> y_new[T_new];

for (t in 1:T_new){

y_new[t] = bernoulli_rng(p);

}

}

I use beta prior with Bernoulli / binomial likelihood, and the new predictions are based on posterior of p, which we know is also a beta distribution.

For example, we only want to predict next observation, that T_new = 1. If we set iter = 2000, chains = 4 in sampling, I notice that Stan will draw the 4000 samples of parameter p, and 4000 samples of y_new, that is, the sample number of parameters and predicted quantities would be the same. So I am very curious about how stan does this.

I assume if we do prediction out of stan, we can sample from the posterior parameter space, and for each parameter sample,we draw a lot of samples for y_new, say if we draw 1000 parameters from the posterior space, and draw 1000 samples for predicted y_new per parameter sample, we should finally have 1 million samples for y_new, which form the posterior distribution of y_new given y, p(y_new|y).

But with Stan, it seems only one sample is drawn for the prediction per sample of parameter. And this puzzles me.

Thanks!