What Steps to Take after Posterior Predictive Checks?

Hello,

I am seeking assistance regarding the necessary steps after implementing posterior predictive checks. I am replicating Jeffrey Arnold’s polling aggregation model using my own dataset.

See Simon Jackman’s Bayesian Model Examples in Stan (jrnold.github.io).

The ppc_dens_overlay plot indicates that the densities of the simulated datasets do not align well with the density of the observed data.

Similarly, the ppc_stat_2d plot shows that the simulated datasets tend to have means below the observed mean and standard deviations above the observed standard deviation.

My question is: what can be done to reduce the discrepancies between the observed data and the simulated data?

Here’s Stan code:
parameters {
  vector[T] omega_raw;
  real<lower = 0.> tau;
  vector[H] eta_raw;
  real<lower = 0.> zeta;
}
transformed parameters {
  vector[N] mu;
  vector[T] xi;
  vector[H] eta;
  eta = eta_raw * zeta;                              // house effects (non-centered)
  xi[1] = xi_init_loc + omega_raw[1] * xi_init_scale;
  for (t in 2:T) {
    xi[t] = xi[t - 1] + omega_raw[t] * tau;          // random walk for the latent state
  }
  for (i in 1:N) {
    mu[i] = xi[time[i]] + eta[house[i]];
  }
}
model {
  eta_raw ~ normal(3.5, 5.0);           // eta_raw ~ normal(0., 1.);
  zeta ~ normal(4.5, zeta_scale);       // zeta ~ normal(0., zeta_scale);
  tau ~ cauchy(0., 2.95 * tau_scale);   // tau ~ cauchy(0., tau_scale);
  omega_raw ~ normal(5.0, 7.0);         // omega_raw ~ normal(0., 1.);
  y ~ normal(mu, s);
}
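
For reference, the replicated data fed to ppc_dens_overlay and ppc_stat_2d are drawn from the posterior predictive distribution in a generated quantities block along these lines (a minimal sketch consistent with the model above; my actual block follows Arnold’s example):

generated quantities {
  vector[N] y_rep;
  for (i in 1:N) {
    // replicate each poll from the same observation model as y
    y_rep[i] = normal_rng(mu[i], s[i]);
  }
}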

I have tried adjusting the priors by increasing their locations and scales, mainly to see how far they have to move relative to Arnold’s original model (the originals are shown in the comments above). Despite these adjustments, the posterior predictive checks show minimal change.

To repeat my question: what steps could be taken to reduce the discrepancies between the observed data and the simulated data? Any comments you could provide would be much appreciated.

Some things you might consider: perhaps your model is missing key information or structure (for example, predictors that are needed), perhaps it isn’t flexible enough (for example, you fit a linear function for the mean but the relationship is nonlinear), or perhaps your response family doesn’t adequately describe the process that generated the data. I think the best first step is always to go back to the drawing board, rethink the conceptual part of the analysis, and use the discrepancies you see between the posterior predictions and your data to guide how you revise the model.
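
As a purely hypothetical illustration of the first point (the covariate x and coefficient beta below are made-up names, not part of your model), adding a poll-level predictor would look something like this, with each line added to the block named in its comment:

// data block: a poll-level covariate, e.g. an online-vs-phone indicator
vector[N] x;

// parameters block: its coefficient
real beta;

// model block: a weakly informative prior on the new coefficient
beta ~ normal(0., 1.);

// transformed parameters block: include the covariate in the mean
for (i in 1:N) {
  mu[i] = xi[time[i]] + eta[house[i]] + beta * x[i];
}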

When I see something like that small second peak in the density, I start thinking about predictors that I am missing: perhaps the model is, in effect, averaging over those two peaks. You could also try a more flexible response family such as a Student-t.
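
A minimal sketch of that Student-t variant, reusing mu and s from your model (the gamma(2, 0.1) prior on the degrees of freedom is one common default, not something from Arnold’s code):

// parameters block: degrees of freedom for the Student-t likelihood
real<lower = 1.> nu;

// model block: heavier-tailed observation model in place of the normal
nu ~ gamma(2, 0.1);
y ~ student_t(nu, mu, s);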