How to specify weakly informative priors for beta regression with 10 predictors?

Hi all,

I’m running a beta regression with 10 continuous predictors. I’m fairly new to Bayesian modeling (about a week in), and I’m trying to set weakly informative priors. After standardizing my predictors, I specified the following:

  • Intercept: Normal(-0.5, 1.2) — based on the mean and SD from a previous study
  • Beta coefficients: Normal(0, 1)
  • Phi: Gamma(0.1, 0.1) (the brms default)

When I simulate outcome data from these priors, most simulated y values pile up very close to 0 or 1. I assume this is because the variance contributed by 10 predictors, each with a fairly liberal beta prior, adds up on the linear-predictor scale.
I tried tightening the beta priors, but I worry this might make them too informative. I’m unsure how to proceed: how do I keep the priors weakly informative while also keeping the prior predictive distribution reasonable? I would be really grateful for any advice on this :)
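For reference, here is a minimal prior predictive sketch in Python/numpy of what I mean (assumptions: logit link, standardized predictors drawn as standard normals, and brms's mean/precision parameterization of the beta distribution, i.e. shape parameters mu*phi and (1-mu)*phi):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, draws = 100, 10, 1000

X = rng.standard_normal((n, p))                  # standardized predictors
alpha = rng.normal(-0.5, 1.2, size=draws)        # intercept prior draws
beta = rng.normal(0.0, 1.0, size=(draws, p))     # coefficient prior draws
phi = rng.gamma(0.1, scale=1 / 0.1, size=draws)  # Gamma(shape = 0.1, rate = 0.1)

eta = alpha[:, None] + beta @ X.T                # (draws, n) linear predictor
mu = 1.0 / (1.0 + np.exp(-eta))                  # inverse-logit link
# brms parameterizes Beta(mu * phi, (1 - mu) * phi)
y = rng.beta(mu * phi[:, None], (1.0 - mu) * phi[:, None])

frac_extreme = np.mean((y < 0.01) | (y > 0.99))
print(f"fraction of prior predictive draws within 0.01 of the bounds: {frac_extreme:.2f}")
```

Well over half of the simulated outcomes land essentially on the boundary, which is what I'm seeing with my real design matrix.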

Good on you for checking the prior predictive distribution. Shrinkage priors were developed for exactly this reason. I would recommend looking into the R2D2 prior, which is nicely implemented in brms and offers an elegant way to decompose the explained variance (R²) across your predictors.
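A sketch of what that looks like in brms (not run; the `mean_R2` and `prec_R2` values are illustrative, and you'd substitute your own formula and data):

```r
library(brms)

# R2D2 places a Beta(mean_R2, prec_R2) prior on the model's R^2 and
# decomposes it across the population-level coefficients (class = b).
priors <- prior(R2D2(mean_R2 = 0.5, prec_R2 = 2), class = b)

# then something like:
# fit <- brm(y ~ ., data = dat, family = Beta(), prior = priors)
```

Because the total explained variance is constrained, the prior on the linear predictor no longer blows up as you add predictors.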


Hi, @spn, and welcome to the Stan forums!

The problem you’re running into is that independent weakly informative priors do not add up to a weakly informative joint prior. This was the point of the (arguably misleadingly named) paper that led us to take prior predictive checks more seriously.
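You can see the effect directly on the linear-predictor scale. A small numpy sketch (assumption: one observation with every standardized predictor at +1 SD): with 10 independent Normal(0, 1) slopes, the implied prior on the linear predictor has standard deviation √10 ≈ 3.2, so on the logit scale much of the prior mass for μ is pushed toward 0 or 1 even before the Gamma prior on phi gets involved:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 10
x = np.ones(p)                                   # each standardized predictor at +1 SD
beta = rng.normal(0.0, 1.0, size=(100_000, p))   # independent Normal(0, 1) slopes
eta = -0.5 + beta @ x                            # linear predictor; prior sd = sqrt(p)

mu = 1.0 / (1.0 + np.exp(-eta))                  # inverse-logit
print(f"sd of linear predictor: {eta.std():.2f}")
print(f"fraction of mu outside (0.05, 0.95): {np.mean((mu < 0.05) | (mu > 0.95)):.2f}")
```

Each coefficient prior looks harmless on its own; it's the sum over 10 of them that makes the joint prior aggressive.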

As @mhollanders pointed out, you probably need a better weakly informative joint prior from which to simulate. I don’t know if the R2D2 prior easily admits simulation, but @bgoodri will know.

Also, it’s OK to simulate from a more constrained prior than the one you will use to fit, as long as there’s enough data to resolve the model. You can’t do strictly proper simulation-based calibration this way, but I believe it’s what most of us do in practice.


Thank you so much for the replies and explanations, @mhollanders and @Bob_Carpenter :) I will try working it out with the R2D2 prior! @Bob_Carpenter, I’m not sure I understood your last point correctly: if the data are strong enough, the posterior will be dominated by them, so it’s okay to use a slightly different (more constrained) prior for prior predictive checks than for fitting? I might be misunderstanding.