Does the posterior predictive p-value calculated with an improper prior have any problems?

We can calculate a posterior predictive p-value (PPP) regardless of whether the prior is proper.

Does a p-value calculated with an improper prior have any problems?

For a test statistic T(y,\theta), the PPP is calculated by

\iint \mathbb{I}[T(y,\theta) > T(y_{obs},\theta)] \, p(y|\theta) \, p(\theta|y_{obs}) \, dy \, d\theta\\ \approx \frac{1}{I} \sum_{i=1}^{I} \int \mathbb{I}[T(y,\theta_i) > T(y_{obs},\theta_i)] \, p(y|\theta_i) \, dy\\ \approx \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} \mathbb{I}[T(y_i^j,\theta_i) > T(y_{obs},\theta_i)]

where \theta_i, i=1,...,I, are MCMC samples from the posterior and y_i^j, j=1,...,J, are samples drawn from the likelihood p(y|\theta_i) at parameter \theta_i.
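For readers who want to see the Monte Carlo estimate written out, here is a minimal sketch in Python. It is not the model discussed in this thread: it assumes a toy model y_k ~ Normal(mu, 1) with a flat (improper) prior on mu, for which the posterior is the proper distribution Normal(mean(y_obs), 1/n), so the posterior draws are sampled directly instead of by MCMC. The names `posterior_predictive_p`, `simulate`, and `discrepancy` are hypothetical helpers made up for this illustration.

```python
import numpy as np

def posterior_predictive_p(y_obs, theta_draws, simulate, discrepancy, n_rep=1, rng=None):
    """Monte Carlo estimate of P(T(y_rep, theta) > T(y_obs, theta) | y_obs).

    theta_draws : posterior draws theta_i (from MCMC or any exact sampler)
    simulate    : function (theta, rng) -> replicated data set y_rep
    discrepancy : function (y, theta) -> scalar T(y, theta)
    n_rep       : number of replicated data sets per posterior draw (J)
    """
    rng = np.random.default_rng() if rng is None else rng
    exceed = 0
    total = 0
    for theta in theta_draws:
        t_obs = discrepancy(y_obs, theta)
        for _ in range(n_rep):
            y_rep = simulate(theta, rng)
            exceed += int(discrepancy(y_rep, theta) > t_obs)
            total += 1
    # Dividing by I * J gives the average of the indicator, i.e. the PPP estimate.
    return exceed / total

# Toy data (assumption for illustration only).
rng = np.random.default_rng(0)
y_obs = rng.normal(loc=2.0, scale=1.0, size=50)
n = y_obs.size

# Flat prior on mu with known sigma = 1 gives the proper posterior Normal(mean(y_obs), 1/n).
theta_draws = rng.normal(loc=y_obs.mean(), scale=1.0 / np.sqrt(n), size=2000)

def simulate(theta, rng):
    # Draw one replicated data set from the likelihood at theta.
    return rng.normal(loc=theta, scale=1.0, size=n)

def discrepancy(y, theta):
    # Chi-square-type discrepancy T(y, theta).
    return np.sum((y - theta) ** 2)

ppp = posterior_predictive_p(y_obs, theta_draws, simulate, discrepancy, n_rep=1, rng=rng)
print(ppp)
```

With real MCMC output, `theta_draws` would be replaced by the sampler's posterior draws, and `simulate` by a draw from the model's own likelihood.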

I think that as long as you have a proper posterior (and all other convergence metrics are valid), you can do whatever you want with your MCMC samples, independently of the prior being proper or improper.

As for the use of PPP, do you really need to produce a p-value? Wouldn’t showing a posterior uncertainty interval be more consistent with a Bayesian analysis?


Thank you for your reply.

At the moment, I have not established whether the posterior is proper when the prior is improper for my model.
If the posterior is improper, is the PPP then unreliable?

I am planning to show both a PPP and a posterior uncertainty interval [a,b] for each model parameter, where [a,b] is defined by \int_a^b p(\theta|y) \, d\theta = 0.95.
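The 95% condition alone does not pin down a unique [a,b]; one common choice is the equal-tailed interval, which puts 2.5% of the posterior mass in each tail. A minimal sketch of computing it from posterior draws follows. The function name `equal_tailed_interval` is made up for this illustration, and the draws are a standard normal stand-in, not output from the model in this thread.

```python
import numpy as np

def equal_tailed_interval(draws, level=0.95):
    """Return (a, b) such that roughly `level` of the draws lie in [a, b],
    with equal probability mass left out in each tail."""
    draws = np.asarray(draws, dtype=float)
    tail = (1.0 - level) / 2.0
    a, b = np.quantile(draws, [tail, 1.0 - tail])
    return a, b

# Stand-in for MCMC draws of a single parameter (assumption for illustration).
rng = np.random.default_rng(1)
draws = rng.normal(loc=0.0, scale=1.0, size=10000)

a, b = equal_tailed_interval(draws)
# Fraction of draws inside the interval; should be close to the requested level.
coverage = np.mean((draws >= a) & (draws <= b))
print(a, b, coverage)
```

A highest-density interval is another option that satisfies the same 95% condition and is the shortest such interval for a unimodal posterior.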

When can we say that the interval is more consistent? If the interval is narrow (compared to what?), can the estimate then be said to be more consistent?

If the posterior is improper, it’s just not suitable for any sort of inference. It’s useless. So when employing improper priors it’s important to be able to check that the resulting posterior is proper.

Thank you for your reply.

So, I have to verify whether or not the posterior is proper.

Yup. And that’s precisely why improper priors are a bit frowned upon; because if your model is high-dimensional or very complicated, it becomes virtually impossible to prove/show that the posterior is proper.

From a numerical perspective, an improper prior seems to work better across various data sets (this is only a rough impression on my part).
But theoretically, it should not be allowed.
My model is complicated, which makes it difficult both to find a suitable prior and to show whether the posterior is proper when the prior is not.

Well, if you’re allowed to share the model code, I suggest you do that. Even better if you can write down the mathematical formulation of the model. It’s quite possible that there exists a weakly informative prior that has good performance and would remove the preoccupation with impropriety.

Thank you, I will. To tell the truth, I have uploaded my model code, but not here.
I’m hesitant because my code and model are so complicated.
