Simulating multiple datasets from posterior predictive distribution

I am aware that I can use the generated quantities block to simulate from the posterior predictive distribution. However, it seems that I can only generate one dataset each time. How would I go about generating multiple datasets (say 100)?

data {
  int<lower=0> N;
  vector[N] y;
  vector[N] x;
}
parameters {
  real beta0;
  real beta1;
  real<lower=0> sigma;
}
model {
  y ~ normal(beta0 + beta1*x, sigma);
  beta0 ~ normal(0,1);
  beta1 ~ normal(0,1);
  sigma ~ gamma(1,1);
}
generated quantities {
  vector[N] y_pred;
  for (n in 1:N) {
    y_pred[n] = normal_rng(beta0 + beta1*x[n], sigma);
  }
}

You can specify the number of datasets as M (declared in the data block and passed in alongside the rest of the data):

generated quantities {
  matrix[N, M] y_pred;
  for (n in 1:N) {
    for (m in 1:M) {
      y_pred[n, m] = normal_rng(beta0 + beta1*x[n], sigma);
    }
  }
}
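To see what that loop produces, here is a minimal sketch outside Stan (in plain NumPy, since the thread later mentions both R and Python). The values of beta0, beta1, and sigma are stand-ins for a single posterior draw, not output from the model above:

```python
# Sketch of the generated quantities loop: M simulated datasets of length N,
# drawn from normal(beta0 + beta1*x, sigma) for one posterior draw.
import numpy as np

rng = np.random.default_rng(1)
N, M = 50, 100
x = np.linspace(0, 1, N)
beta0, beta1, sigma = 0.5, 2.0, 1.0  # stand-in values for one posterior draw

# One normal draw per (observation, dataset) cell, like the nested Stan loop.
y_pred = rng.normal(loc=beta0 + beta1 * x[:, None], scale=sigma, size=(N, M))

print(y_pred.shape)  # (50, 100): N rows, one column per simulated dataset
```

In Stan itself, a fresh set of draws is produced for every posterior iteration, so each saved iteration carries its own M replicated datasets.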

Thanks for your reply. I can see the 100 datasets when I extract the fit. Do you know how I could plot density curves for all of the datasets on the same plot?

I think you would need to show the format of the data frame you have the data in to answer that question, as it might require a few steps and will depend on whether, e.g., you are using R or Python or something else.

In general in R you could put all the data into long form - one column with all the data points and another column indicating which sample those points come from. Then something like:

ggplot(dataframe) +
  geom_density(aes(x = estimate, color = sample))
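The same long-form reshape can be sketched in Python with pandas (a hypothetical stand-in array is used here in place of the y_pred draws you would extract from the fit):

```python
# Sketch: reshape an S x N array of posterior predictive draws into long form,
# one row per simulated point, tagged with the draw ("sample") it came from.
import numpy as np
import pandas as pd

S, N = 100, 50                              # 100 draws, 50 observations
y_pred = np.random.normal(size=(S, N))      # stand-in for extracted draws

long_df = (
    pd.DataFrame(y_pred)
    .reset_index()
    .rename(columns={"index": "sample"})
    .melt(id_vars="sample", var_name="obs", value_name="estimate")
)

print(long_df.shape)  # (5000, 3): S * N rows, columns sample/obs/estimate
```

From there, a density plot grouped by the sample column (e.g., seaborn's kdeplot with hue="sample") mirrors the ggplot call above.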

Alternatively, select a subsample, say 25 samples, and plot them in a grid:

ggplot(dataframe) +
  geom_density(aes(x = estimate)) +
  facet_wrap(~sample)