How can I choose priors during Bayesian meta-regression for prevalence studies?

I am conducting a Bayesian meta-analysis and meta-regression of prevalence studies to estimate the pooled proportion of X-disease across 47 studies, and to identify factors that influence the pooled prevalence. Each study reports the number of individuals studied (n), the number of cases (y), and four categorical covariates:

  • Quality_cat (0 = No, 1 = Yes)
  • Period_cat (0 = Period A, 1 = Period B)
  • Region_cat (0 = Africa, 1 = America, 2 = Asia)
  • Study_set_cat (0 = Clinic-based, 1 = Community-based)

My goal is to assess whether any of these categorical variables significantly influence the pooled prevalence estimates using a Bayesian meta-regression approach. My model is as follows:

model_qprs <- brm(
  formula = y | trials(n) ~ Quality_cat + Period_cat + Region_cat + Study_set_cat + (1 | Study),
  data = dat_meta_r,
  family = binomial(link = "logit"),
  prior = priors_r,
  chains = 4, cores = 4, iter = 10000
)

Now, my questions are:

  1. How many priors should I set for this model?
  2. I usually set a uniform prior for “Intercept” and a half-Cauchy for “sd”; is this approach right?
    Like this:
priors_r <- c(prior("uniform(-11.51292, 11.51292)", lb = -11.51292, ub = 11.51292, class = "Intercept"),prior(cauchy(0,0.5), class="sd"))

Some of my results:
When I used the above-mentioned priors, the results were:

> fixef(model_qprs) %>% exp()
                 Estimate Est.Error       Q2.5     Q97.5
Intercept      0.09499462  1.479701 0.04368948 0.2058752
Quality_cat1   2.51869058  1.406699 1.30003779 4.9153949
Period_cat1    0.51401993  1.323767 0.29504304 0.8997983
Region_cat1    1.48700788  1.418034 0.75227804 2.9877303
Region_cat2    2.07207645  1.398817 1.09166429 4.0540301
Study_set_cat1 1.59081426  1.334124 0.89556351 2.7946670

Then I specified another set of priors, like this:

priors_r <- c(
  set_prior("uniform(-11.51292, 11.51292)", lb = -11.51292, ub = 11.51292, class = "b"),
  set_prior("uniform(-11.51292, 11.51292)", lb = -11.51292, ub = 11.51292, class = "Intercept"),
  prior(cauchy(0, 0.5), class = "sd")
)

I think my prior might be wrong (I am not 100% sure) because I am using the same prior for “b” (all categorical predictors) as for the “Intercept”. However, the results were not different from the previous model:

> fixef(model_qprs2) %>% exp()
                Estimate Est.Error       Q2.5     Q97.5
Intercept      0.0946217  1.498652 0.04199278 0.2049134
Quality_cat1   2.5051011  1.407878 1.27128787 4.8934638
Period_cat1    0.5162192  1.321038 0.29930788 0.8938807
Region_cat1    1.4822915  1.432052 0.74764995 3.0668281
Region_cat2    2.0793034  1.411066 1.06809681 4.1348043
Study_set_cat1 1.5979427  1.332276 0.90883952 2.8205218

Then I tried a different prior for “b” (all categorical predictors), like this:

priors_r <- c(
  prior(normal(0, 1), class = "b"),
  set_prior("uniform(-11.51292, 11.51292)", lb = -11.51292, ub = 11.51292, class = "Intercept"),
  prior(cauchy(0, 0.5), class = "sd")
)

This time the results changed a bit:

> fixef(model_qprs4) %>% exp()
                Estimate Est.Error       Q2.5     Q97.5
Intercept      0.1064373  1.441650 0.05235271 0.2188714
Quality_cat1   2.2784502  1.374553 1.21148724 4.2269612
Period_cat1    0.5414316  1.305883 0.32168395 0.9162455
Region_cat1    1.3844337  1.378682 0.74237612 2.5926047
Region_cat2    1.8640359  1.361793 1.00965215 3.4291363
Study_set_cat1 1.6090042  1.307058 0.94782116 2.7090814

I am not sure which prior would be best.

Thanks in advance

In general, there is no “best” prior (if there were, the whole Bayesian exercise would be a whole lot easier :) ). You might want to read through Prior Choice Recommendations · stan-dev/stan Wiki · GitHub. A common technique is to try some flavor of flat, weakly informative, and (if applicable) informative priors and assess the sensitivity of the results to these choices. I would personally add that sometimes one may want to consider a “skeptical” or “adversarial” prior, where you try to iteratively determine how strong a prior you need to make your effects of interest “insignificant”. This might not be directly relevant to your study, but it sometimes comes up in medical or econometric applications.
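A minimal sketch of such a sensitivity check, reusing the model from the question (the prior scales below are illustrative placeholders, not recommendations; get_prior() also lists every prior slot the model exposes, which speaks to question 1):

library(brms)

# Every parameter class/coefficient that accepts a prior in this model
get_prior(y | trials(n) ~ Quality_cat + Period_cat + Region_cat +
            Study_set_cat + (1 | Study),
          data = dat_meta_r, family = binomial(link = "logit"))

# Refit under a few prior flavors and compare the odds ratios
prior_sets <- list(
  diffuse   = c(prior(normal(0, 10), class = "b"),
                prior(cauchy(0, 0.5), class = "sd")),
  weak      = c(prior(normal(0, 1), class = "b"),
                prior(cauchy(0, 0.5), class = "sd")),
  skeptical = c(prior(normal(0, 0.2), class = "b"),  # shrinks effects toward the null
                prior(cauchy(0, 0.5), class = "sd"))
)
fits <- lapply(prior_sets, function(p)
  brm(y | trials(n) ~ Quality_cat + Period_cat + Region_cat +
        Study_set_cat + (1 | Study),
      data = dat_meta_r, family = binomial(link = "logit"),
      prior = p, chains = 4, cores = 4, iter = 10000))
lapply(fits, function(f) exp(fixef(f)))  # how much do the odds ratios move?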


I second the Stan prior wiki.

If you’re trying to assess classical significance, frequentist methods may be preferable. You can characterize uncertainty with Bayes, but you don’t get classical p-values.

Where did 11.51292 come from? Is that a physical constraint on a parameter? If not, I’d recommend unconstrained parameters with priors that concentrate where you think values are. They tend to work much better computationally and don’t degenerate at the constraint boundaries.

Hi @Bob_Carpenter, thanks for your reply. I also did my analysis using a frequentist approach, but I want to do a similar meta-regression in a Bayesian framework. In this case, I just want to get odds ratios with 95% credible intervals. To answer your question “Where did 11.51292 come from?”: I know there are several ways to set a prior, but I used a somewhat primitive one (for a reason). Please let me know if I am wrong. Our belief (among those of us who work with X-disease) was that the prevalence of X-disease can range from as low as close to 0% (not exactly 0%) to as high as close to 100% (not exactly 100%), and that the distribution may be uniform over that range. I then assigned that belief as a prior on the logit scale. Since

plogis(11.51292)
# [1] 0.99999

I used 11.51292 as the “ub” of my uniform prior in the brm() model.

Thanks again

Is there something different you’re hoping to learn?

\text{logit}^{-1}(11.5) = 0.9999899

Rather than \text{uniform}(-11.5, 11.5), I’d suggest just using \text{logistic}(0, 1). That is uniform on (0, 1) when you transform back with inverse logit. And it doesn’t leave long ugly constants lying around in the code.
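In brms terms, that suggestion might look like the following sketch (an assumed translation: set_prior() passes any Stan density string through to the model, and the “b” and “sd” priors are simply carried over from earlier in the thread):

# Sketch: logistic(0, 1) on the intercept is uniform on the probability
# scale after inverse logit; "b" and "sd" priors carried over from above.
priors_r <- c(
  set_prior("logistic(0, 1)", class = "Intercept"),
  prior(normal(0, 1), class = "b"),
  prior(cauchy(0, 0.5), class = "sd")
)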

If you want to share what that reason is, we can help you work through the implications for Bayesian analysis. But for logistic regression, you should look at what the uniform looks like pushed back through inverse logit.

Here’s a histogram of x = \text{logit}^{-1}(y) for draws of y \sim \text{uniform}(-11.5, 11.5).

So your prior is very strong and pushes results toward 0 or 1.

The equivalent picture of y \sim \text{logistic}(0, 1) is uniform back on the probability scale.
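A sketch to reproduce both pictures, pushing prior draws through the inverse logit:

# Draw from each prior on the logit scale, transform, and compare histograms.
set.seed(1)
y_unif <- runif(1e5, -11.5, 11.5)  # the original uniform prior
y_logi <- rlogis(1e5, 0, 1)        # the suggested logistic(0, 1) prior
par(mfrow = c(1, 2))
hist(plogis(y_unif), breaks = 50, main = "uniform(-11.5, 11.5)",
     xlab = "inverse-logit draw")  # mass piles up near 0 and 1
hist(plogis(y_logi), breaks = 50, main = "logistic(0, 1)",
     xlab = "inverse-logit draw")  # flat on (0, 1)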


Hi @Bob_Carpenter, thank you so much for pointing out and visualizing what my current prior actually implies, and for suggesting a prior distribution that matches my assumption :)